Pink Iguana




Markus Püschel, How to Write Fast Numerical Code, Spring 2014, here. Not bad.

Bordawekar, IBM, Believe It or Not! Multi-core CPUs Can Match GPU Performance for FLOP-intensive Application! here.

We evaluated the performance of a real-world image processing application on the latest GPU and commodity multi-core processors using a wide variety of parallelization techniques. A pthreads-based version of the application running on a dual quad-core Intel Xeon system was able to match nVidia 285 GPU performance. Using fully automatic compiler-driven auto-parallelization and optimization, a single Power7 processor was able to achieve performance better than that on the nVidia 285 GPU. This is a compelling productivity result, given the effort required to develop an equivalent high-performance CUDA implementation. Our results also conclusively demonstrate that, under certain conditions, it is possible for a program running on a multi-core processor to match or even beat the performance of an equivalent GPU program, even for a FLOP-intensive structured application. In future, we plan to compare performance of such applications on upcoming GPU architectures from AMD and nVidia, e.g., nVidia Fermi.
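For a feel of what a "pthreads-based version" of an image kernel looks like, here is a minimal sketch. It is not the application from the report: the per-pixel scaling kernel, the names band_t, scale_band and scale_image_parallel, and the MAX_THREADS cap are all made up for illustration. The only idea it shows is splitting the image into contiguous bands of rows, one band per thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 64  /* illustrative cap, not taken from the report */

/* Hypothetical work descriptor: one contiguous band of image rows. */
typedef struct {
    const float *in;
    float *out;
    size_t width;
    size_t row_begin; /* first row of the band (inclusive) */
    size_t row_end;   /* last row of the band (exclusive) */
    float gain;
} band_t;

/* Thread body: scale every pixel in the band. This is a stand-in for a
 * real FLOP-intensive kernel. Bands never overlap, so no locking is needed. */
static void *scale_band(void *arg) {
    band_t *b = arg;
    for (size_t y = b->row_begin; y < b->row_end; ++y)
        for (size_t x = 0; x < b->width; ++x)
            b->out[y * b->width + x] = b->gain * b->in[y * b->width + x];
    return NULL;
}

/* Split `height` rows across `nthreads` threads and wait for all of them.
 * Returns 0 on success, or the pthread_create error code on failure. */
int scale_image_parallel(const float *in, float *out,
                         size_t width, size_t height,
                         float gain, size_t nthreads) {
    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    pthread_t tid[MAX_THREADS];
    band_t band[MAX_THREADS];
    size_t started = 0;
    int rc = 0;

    for (size_t t = 0; t < nthreads; ++t) {
        /* Integer arithmetic spreads any remainder rows across the bands. */
        band[t] = (band_t){ in, out, width,
                            height * t / nthreads,
                            height * (t + 1) / nthreads,
                            gain };
        rc = pthread_create(&tid[t], NULL, scale_band, &band[t]);
        if (rc != 0)
            break;
        ++started;
    }

    /* Join only the threads that were actually created. */
    for (size_t t = 0; t < started; ++t)
        pthread_join(tid[t], NULL);

    return rc;
}
```

Build with something like cc -O2 -pthread. On a dual quad-core box you would call it with nthreads set to 8. Whether a kernel this simple scales is mostly a memory-bandwidth question, so treat it as a skeleton for the threading, not as a benchmark.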

