DataTau
3 points by Ra 2553 days ago

Seeing as you asked specifically about GPUs (not ASICs, TPUs, etc.) and neither cost nor scaling is a concern, then for a single 'unit' you can't really go past NVIDIA's DGX-1 system:

http://www.nvidia.com/object/deep-learning-system.html

It's 8 Tesla P100 GPGPUs (i.e. modern, very capable GPUs) tightly coupled via NVIDIA's NVLink high-speed interconnect and managed by dual 20-core Xeon E5-2698 CPUs, with an 8 TB SSD array (in RAID 0, which generally gives much higher sequential read throughput). The interconnect is very important, as I/O is often the bottleneck, depending on your application.
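To see why the interconnect matters, a rough back-of-envelope comparison helps. The bandwidth figures below are illustrative assumptions (roughly PCIe 3.0 x16 vs. the P100's four NVLink 1.0 links), not measured numbers:

```python
# Back-of-envelope: how long does it take to move data between GPUs?
# All bandwidth figures are assumptions for illustration, not measurements.

PCIE3_X16_GBPS = 16.0    # assumed ~16 GB/s per direction, PCIe 3.0 x16
NVLINK_P100_GBPS = 80.0  # assumed ~80 GB/s per direction (4 NVLink 1.0 links x 20 GB/s)

def transfer_time_s(gigabytes: float, bandwidth_gbps: float) -> float:
    """Seconds to move `gigabytes` at `bandwidth_gbps` GB/s."""
    return gigabytes / bandwidth_gbps

# e.g. exchanging 10 GB of gradients per synchronization step:
print(f"PCIe:   {transfer_time_s(10, PCIE3_X16_GBPS):.3f} s per exchange")
print(f"NVLink: {transfer_time_s(10, NVLINK_P100_GBPS):.3f} s per exchange")
```

If your workload synchronizes large tensors across GPUs every step, that ~5x gap in transfer time compounds quickly.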

Note that the (nice) comparison by Tim Dettmers does not benchmark the faster Tesla P100 cards the DGX-1 is built on, but it makes good points about bandwidth and cost.

However, all of this is without knowing your application. If scaling is likely to be an issue and your peak load is a one-off, a cloud solution such as AWS (an EC2 P2 instance: https://aws.amazon.com/ec2/instance-types/ ) might serve you better.
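The buy-vs-rent call usually comes down to how many hours you'll actually keep the hardware busy. A minimal break-even sketch, where both prices are assumptions for illustration (roughly a DGX-1 list price and an on-demand rate for a 16-GPU cloud instance), not quotes:

```python
# Rough break-even between buying a DGX-1-class box and renting cloud GPUs.
# Both prices are illustrative assumptions, not real quotes.

PURCHASE_PRICE_USD = 129_000.0    # assumed DGX-1-class list price
CLOUD_RATE_USD_PER_HOUR = 14.40   # assumed on-demand rate, 16-GPU instance

def break_even_hours(purchase_price: float, hourly_rate: float) -> float:
    """Hours of cloud rental that would cost as much as buying outright."""
    return purchase_price / hourly_rate

hours = break_even_hours(PURCHASE_PRICE_USD, CLOUD_RATE_USD_PER_HOUR)
print(f"Break-even at ~{hours:.0f} instance-hours (~{hours / 24:.0f} days of 24/7 use)")
```

If your peak load fits well under that figure, rent; if you'll run flat out for a year or more, owning starts to win (ignoring power, cooling, and admin costs, which tilt things back toward the cloud).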



