While Nvidia developers report 100x speed increases with some CUDA kernels, Intel's own tests show only 14x.
A recent paper written by Intel and presented at the International Symposium on Computer Architecture (ISCA) in Saint-Malo, France claims that Nvidia's GeForce GTX 280 GPU is only 14x faster than its Core i7 960 processor. The paper attempts to debunk claims made by Nvidia developers who have seen 100x performance improvements in some application kernels using CUDA when compared to running them on a CPU.
But is that any surprise? GPUs like the Nvidia GTX 280 have 240 processing cores, while the Core i7 960 has just four. However, it's unclear how Intel arrived at its "14x" conclusion, as the findings refer to a set of unspecified benchmarks--Nvidia even pointed out that they weren't identified in the paper.
"[But] it's actually unclear...what codes were run and how they were compared between the GPU and CPU," said Nvidia spokesperson Andy Keane. "[Still], it wouldn't be the first time the industry has seen Intel using these types of claims with benchmarks."
Keane also suggested that Intel ran the kernels on the previous-generation GTX 280 without any optimizations.
"[Yes], it is indeed true that not *all* applications can see this kind of speed up; some just have to make do with an order of magnitude performance increase. But 100x speed ups, and beyond, have been seen by hundreds of developers," Keane told TG Daily in an e-mailed statement.
Playing on the paper's title--Debunking the 100x GPU vs CPU Myth--Keane added that the real myth is that multi-core CPUs are easy for any developer to use and see performance improvements. "In contrast, [our] CUDA parallel computing architecture is a little over 3 years old and already hundreds of consumer, professional and scientific applications are seeing speedups ranging from 10 to 100x using Nvidia GPUs."
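Neither side has published the exact benchmark code, so the kernels behind the 14x and 100x figures remain unknown. But the "application kernels" at issue are typically small data-parallel routines, and a purely hypothetical sketch (SAXPY here is an illustrative stand-in, not one of the paper's benchmarks) shows why they map so naturally onto a GPU's hundreds of cores: one thread per array element.

// Hypothetical illustration only -- not a benchmark from the Intel paper.
// SAXPY (y = a*x + y): the kind of data-parallel kernel that spreads
// cleanly across the GTX 280's 240 cores, one array element per thread.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 20;                  // ~1M elements
    const size_t bytes = n * sizeof(float);

    // Host-side input data
    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // Device-side copies
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements
    saxpy<<<(n + 255) / 256, 256>>>(n, 3.0f, dx, dy);

    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);           // expect 5.0 (3*1 + 2)

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}

Note the explicit host-to-device copies: kernel-only timings usually exclude them, which is one reason a kernel speedup can overstate the gain a complete application will see.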
Naturally, Intel retaliated: a company spokesperson told TG Daily that Nvidia had taken "one small part of the paper" out of context, and even added that GPU kernel performance is often exaggerated.
"General purpose processors such as the Intel Core i7 or the Intel Xeon are the best choice for the vast majority of applications, be they for the client, general or HPC market segments," said an Intel spokesperson. "This is because of the well-known Intel Architecture programming model, mature tools for software development and more robust performance across a wide range of workloads--not just certain application kernels."
"While understanding kernel performance can be useful, kernels typically represent only a fraction of the overall work a real application does. As you can see from the data in the paper – claims around the GPU's kernel performance are often exaggerated.
Intel Core i7"[Now], general purpose processors such as the Intel Core i7 or the Intel Xeon are the best choice for the vast majority of applications, be they for the client, general server or HPC market segments. This is because of the well-known Intel Architecture programming model, mature tools for software development and more robust performance across a wide range of workloads - not just certain application kernels.
"[Yes], it is possible to program a graphics processor to compute on non-graphics workloads. But optimal performance is typically achieved only with a high amount of hand optimization, require graphics languages similar to DirectX or OpenGL shader programs or non-industry standard languages.
"For those HPC application that do benefit from an extremely high level of parallelism, the Intel MIC architecture will be a good choice as it supports standard tools and libraries in standard high level languages like C/C++, FORTRAN, OpenMP, MPI among many other standards."