NVIDIA has been closing in on ARM for quite some time and has already started promoting the architecture in its benchmarks. A100-equipped servers with an ARM CPU and with an x86 CPU were found to deliver very similar performance (although the x86 system still had the higher peak).
The eternal problem is, of course, that while ARM beats the socks off of x86 in low-power, high-efficiency scenarios (think smartphones), it has not been able to carry that power efficiency up to high clocks. Leakage is in fact one of the reasons Apple's new A15 chips have been a relative disappointment so far. Servers, sitting at the absolute high end of performance compute, are therefore an area where x86 has typically reigned supreme, although NVIDIA would love to change that narrative. The ARM-based A100 server actually managed to beat x86 in the niche 3D-UNet workload, while more common ones like ResNet-50 remain x86-dominated.
Of course, when you are talking inference, GPUs remain king. NVIDIA did not pull any punches when it pointed out that an A100 GPU is 104x faster than a CPU in MLPerf benchmarks.
“The latest inference results demonstrate the readiness of Arm-based systems powered by Arm-based CPUs and NVIDIA GPUs for tackling a broad array of AI workloads in the data center,” he added.
Everything from the popular ResNet-50 image classification benchmark to Natural Language Processing was tested, and the A100 GPU reigned supreme across the board. With NVIDIA facing the final regulatory hurdles in its acquisition of ARM, we are going to see Jensen push for ARM dominance in the server space and for the surrounding ecosystem to spring into action. While it won't happen overnight, the first real threat to x86 as the premier compute architecture might well be underway.
MLPerf’s inference benchmarks are based on today’s most popular AI workloads and scenarios, covering computer vision, medical imaging, natural language processing, recommendation systems, reinforcement learning and more.