Imagination makes the case for mobile OpenCL

GDC 2012: More speed, capabilities, AND battery life

Imagination was showing off GPU compute on a cell phone chip at GDC: physics in your pocket. If you think it will take upcoming GPUs to make this a reality, think again; it was running on a current TI OMAP 4430 chip.

The idea is the same one that everyone is talking about: offload certain functions from the CPU to the GPU to speed them up, and possibly save some energy while doing so. In a desktop, CPU power is in abundance and power use is not really a big concern. While laptops are a little more energy conscious, anything that will tax a laptop CPU for long periods will probably be done while plugged in. This means that on those machines, GPU compute and OpenCL are used mainly to address problems that CPUs are really bad at, or that just take a long time.

Chips aimed at mobile phones, however, are very concerned with energy use. CPU power is adequate, but it is still a small fraction of what is available in a modern desktop. If you can do something that speeds up computing and saves energy, it is a clear win here. Expanding the envelope of what is possible is a bonus, but it is secondary. Imagination was showing off all three: faster solutions, things that were not possible the ‘old way’, and both while using less energy. The first demo looked like this.


Pandaboard running OpenCL cloth demo

That demo was pretty simple: take a Pandaboard with a TI OMAP 4430, which pairs a dual-core ARM A9 CPU with an Imagination SGX540 GPU, and run a cloth simulation on it. Not only could the OpenCL version exploit the GPU to do more balls and sheets than the CPU version, but it saved power while doing so. How much? On one CPU core, the simulation drew about 0.68A at 5V to run at 14FPS with 100% CPU load. With two A9 cores loaded, the Pandaboard pulled 0.84A and ran at 24FPS. In OpenCL, CPU load dropped to less than 30%, FPS jumped to 42, and power draw went down to 0.60A. Compared to the single-core run, that is more than 10% less power and 3x the frame rate, and you could run more simulations on the same box if you wanted. Not bad at all.
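
To put those numbers on a common footing, here is a minimal sketch that converts the quoted current draw into watts and into energy per simulated frame. It assumes the amperage figures are whole-board draw at a steady 5V and that the frame rates are sustained averages; the names `RUNS` and `summarize` are made up for this illustration and do not come from the demo.

```python
# Figures quoted above for the Pandaboard cloth demo.
# Assumption: currents are whole-board draw at a steady 5 V supply.
SUPPLY_VOLTS = 5.0

# (label, current in amps, frames per second)
RUNS = [
    ("1x A9 core", 0.68, 14),
    ("2x A9 cores", 0.84, 24),
    ("OpenCL on SGX540", 0.60, 42),
]


def summarize(runs, volts=SUPPLY_VOLTS):
    """Return {label: (watts, joules_per_frame)} for each run."""
    out = {}
    for label, amps, fps in runs:
        watts = amps * volts            # P = I * V
        joules_per_frame = watts / fps  # energy spent on each frame
        out[label] = (watts, joules_per_frame)
    return out


if __name__ == "__main__":
    for label, (watts, jpf) in summarize(RUNS).items():
        print(f"{label:18s} {watts:4.2f} W  {jpf * 1000:6.1f} mJ/frame")
```

Under those assumptions the OpenCL run draws 3.0W against 3.4W for one core and 4.2W for two, and each frame costs about 3.4 times less energy than on a single core, or roughly 2.4 times less than on both cores. Power per frame is arguably the fairer comparison, since the GPU version is doing more work per second while drawing less.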


OpenCL image processing filter demo

There was also an image processing filter demo running on the same platform that showed similar results. Instead of higher FPS numbers, the OpenCL version did the job quite a bit faster, applying the filters in notably less time. In case you didn’t see where this was going, energy use went down by a similar amount to the cloth simulation. Unlike the cloth demo, however, the image filter demo is directly applicable to photos and video recording; it can cut down on power while taking better pictures.

Imagination was quite proud of their GPU compute and OpenCL capabilities on the SGX540. In the environments where this GPU core is used, basically phones and tablets, power use is the key factor. Here, OpenCL can produce better results faster, and extend battery life too. What’s not to like? If you think this is good, wait until you see what insiders are saying Rogue can do here; think leaps and bounds. S|A

Editor’s note: You can learn more about this type of material at AFDS 2012. More articles of this type can be found on SemiAccurate’s AFDS 2012 links page. Special for our readers: if you register for AFDS 2012 and use promo code SEMI12, you get $50 off. Another way to have fun is to play the easter egg hunt; there is a bonus just for playing.


Charlie Demerjian

Roving engine of chaos and snide remarks at SemiAccurate
Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate. SemiAccurate is a technology news site addressing hardware design, software selection, customization, securing and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also available through Guidepoint and Mosaic.