Does GPU Bandwidth Matter?
Posted March 02, 2017 by rob-ART morgan, mad scientist
Inquiring minds want to know if the eGPU's lower PCIe bandwidth affects performance compared to the internal x16 PCIe slot of a Mac Pro tower. In other words, "Does GPU bandwidth matter?" To provide an answer, we reprised some GPU-intensive tests using the same GTX 980 Ti GPU installed either inside or outside three Macs.
GRAPH LEGEND
cMP>x16>980Ti - GeForce GTX 980 Ti (6GB) GPU installed in the 'mid 2010' Mac Pro tower's PCIe 2.0 x16 slot #1
rMBP15>Node>980Ti - GeForce GTX 980 Ti (6GB) GPU installed in the AKiTiO Node Thunderbolt 3 eGFX Box and connected to a Thunderbolt 3 port of a 'late 2016' MacBook Pro 15-inch
nMP>Node>980Ti - GeForce GTX 980 Ti (6GB) GPU installed in the AKiTiO Node Thunderbolt 3 eGFX Box and connected to a Thunderbolt 2 port of a 'late 2013' Mac Pro 8-core cylinder
To set the stage, here is the Device-to-Host bandwidth of each example as reported by CUDA-Z.
(HIGHER megabytes per second = FASTER)
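For a rough sense of where those numbers can top out, here is a minimal Python sketch that computes the theoretical one-direction PCIe ceiling for each setup. The link widths are assumptions, not measurements: it assumes the Thunderbolt 3 port carries PCIe 3.0 x4 and the Thunderbolt 2 port carries PCIe 2.0 x4, and it ignores Thunderbolt and protocol overhead, so real CUDA-Z readings will come in lower. The names `pcie_bandwidth_mb_s` and `ASSUMED_LINKS` are made up for this illustration.

```python
# Theoretical one-direction PCIe bandwidth, before protocol overhead.
# These are ceilings from the PCIe signaling rates, NOT CUDA-Z results.

# generation -> (transfer rate per lane in GT/s, line-encoding efficiency)
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),      # PCIe 2.0: 5 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),   # PCIe 3.0: 8 GT/s, 128b/130b encoding
}


def pcie_bandwidth_mb_s(generation, lanes):
    """Return the theoretical bandwidth of a PCIe link in MB/s."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    # GT/s * efficiency = usable Gbit/s per lane; / 8 = GB/s; * 1000 = MB/s
    return gt_per_s * efficiency / 8 * 1000 * lanes


# ASSUMPTION: link generation and lane count behind each configuration.
ASSUMED_LINKS = {
    "cMP>x16>980Ti": (2, 16),      # internal PCIe 2.0 x16 slot
    "rMBP15>Node>980Ti": (3, 4),   # Thunderbolt 3, assumed PCIe 3.0 x4
    "nMP>Node>980Ti": (2, 4),      # Thunderbolt 2, assumed PCIe 2.0 x4
}

if __name__ == "__main__":
    tower = pcie_bandwidth_mb_s(*ASSUMED_LINKS["cMP>x16>980Ti"])
    for name, (generation, lanes) in ASSUMED_LINKS.items():
        ceiling = pcie_bandwidth_mb_s(generation, lanes)
        print(f"{name}: {ceiling:.0f} MB/s ceiling, "
              f"tower is {tower / ceiling:.1f}x wider")
```

Under those assumptions the tower's slot is roughly two times wider than the Thunderbolt 3 link and four times wider than the Thunderbolt 2 link.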
PURE GPU CRUNCH
Blender - Lets you choose OpenCL, CUDA, or CPU to render a 3D scene. In each case the GTX 980 Ti rendered using CUDA.
(LOWER time in seconds = FASTER)
OctaneRender
This is a "GPU only" standalone renderer that can process scenes created in Maya, ArchiCAD, Cinema 4D, etc., and does so in a fraction of the time it takes with a CPU-based renderer. However, it runs only on CUDA-capable NVIDIA graphics cards like the GTX 980 Ti. We used the DEMO version with a test scene called octane_benchmark.ocs. For our test we selected RenderTarget PT (Path Tracing).
(LOWER time in seconds = FASTER)
DaVinci Resolve - Candle project playback renders Noise Reduction Node (1NR) on-the-fly during playback. In Preferences > Video I/O and GPU, we chose CUDA for GPU processing mode.
(HIGHER frames per second = FASTER)
MIXED GPU AND CPU
Final Cut Pro X - Using a sample 2-minute 1080p clip, we rendered a Focus Blur effect.
(LOWER time in seconds = FASTER)
Motion 5 - Render RAM Preview of the Atmospheric 600-frame template.
(LOWER time in seconds = FASTER)
Tomb Raider Built-in Benchmark using High Preset
(HIGHER frames per second = FASTER)
WHAT DID WE LEARN?
The bandwidth of an eGPU's PCIe slot is severely restricted compared to the internal x16 PCIe 2.0 slot #1 of a Mac Pro tower. However, it does not necessarily restrict the performance of the graphics card.
If bandwidth were a critical performance factor, the Mac Pro tower would be TWO TO FOUR TIMES FASTER than the Macs with eGPUs. In reality, the performance gap was much smaller than the bandwidth gap. Take another look at the PCIe bandwidth graph below and compare that to the graphs above.
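One way to see why the two gaps differ is a toy model in which a job's run time is the time spent moving data over the PCIe link plus the time the GPU spends crunching. The Python sketch below is only an illustration: the bandwidth figures, data sizes, and compute times are hypothetical placeholders, not our measurements, and the functions `run_time_s` and `slowdown` are invented for this example.

```python
# Toy model: run time = PCIe transfer time + GPU compute time.
# All numbers below are HYPOTHETICAL, chosen only to show the shape of the effect.


def run_time_s(data_mb, bandwidth_mb_s, compute_s):
    """Seconds to move data_mb over the link, plus compute_s on the GPU."""
    return data_mb / bandwidth_mb_s + compute_s


def slowdown(data_mb, compute_s, fast_mb_s, slow_mb_s):
    """How many times longer the job takes on the slower link."""
    slow = run_time_s(data_mb, slow_mb_s, compute_s)
    fast = run_time_s(data_mb, fast_mb_s, compute_s)
    return slow / fast


FAST_LINK_MB_S = 8000.0  # hypothetical internal x16 slot
SLOW_LINK_MB_S = 2000.0  # hypothetical eGPU link, four times narrower

# A render that uploads a scene once and then computes for a minute.
compute_bound = slowdown(1000, 60, FAST_LINK_MB_S, SLOW_LINK_MB_S)

# A job that streams a lot of data and does very little work on it.
transfer_bound = slowdown(100000, 1, FAST_LINK_MB_S, SLOW_LINK_MB_S)

if __name__ == "__main__":
    print(f"compute-bound job:  {compute_bound:.2f}x slower on the narrow link")
    print(f"transfer-bound job: {transfer_bound:.2f}x slower on the narrow link")
```

In this model the compute-bound job barely notices the narrower link, while the transfer-bound job approaches the full four-times penalty. That is consistent with the pure GPU renders above, which load a scene and then spend most of their time computing.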
PURE CPU CRUNCH
Now for perspective, here is a graph showing how the three Macs featured above handle a PURE CPU task that uses all cores (both real and virtual).
The point is that the eGPU may close the gap between desktops and laptops for GPU-intensive apps, but the eGPU won't help when it comes to pure CPU functions.
Comments? Suggestions? Feel free to email me, or follow me on Twitter @barefeats.