Nicole Hemsoth, HPCWire, NVIDIA Steers Roadmap Around GPU Benchmarks, here. It could be OK. Must be hard to have to put all your data and code across the PCIe when you want to execute. But Dally is there so it might be interesting.
NVIDIA’s roadmap for GPU computing revolves around resolving some of the core bottlenecks that have always existed for accelerators in terms of data movement and memory capability. In this era of “big data,” the performance levels drop off with the addition of ever-larger data streams, even with innovations that have tried to get around this by letting the GPU crunch while data movement goes on in the background as with recent efforts around direct memory access (DMA).
NVIDIA’s answer to the data movement bottleneck is found in today’s announcement of NVLink, which is its newly announced chip-to-chip communication approach that lets the GPU talk on a dedicated line with other GPUs, as well as hook directly to the CPU along unified memory lines without the weight of PCIe—which even at its best in the current 3.0 state can’t compare to what they’ve cooked. In effect, this bundle of PCIe pipes with DMA acts much like an extension of high bandwidth (and proprietary, one should add) PCIe. It splits the efficiency and performance drain of pure PCIe into components instead of running both through the same pipes. The end result, said Huang during his keynote, is a 5-12x performance improvement over PCIe 3.0 and a 4x efficiency boost.