AMD takes a major step in enabling GPGPU coding

CodeAnalyst 3.2 allows heterogeneous profiling of code

Apr 14, 2011 by Charlie Demerjian

AMD just took, and then retracted, a major step forward in the whole ‘fusion’ concept, enabling profiling across heterogeneous cores. Although it may sound minor, it is a huge step forward in the usability of the whole paradigm.

Yesterday, AMD (AMD) put up a ‘blog’ post about CodeAnalyst 3.2, the next release of the AMD profiling tool that is currently on v3.1. Within a very short time the page disappeared, but not before a sharp-eyed reader captured it. Before the conspiracy theorists go nuts, someone probably just hit the button to post instead of the one to schedule the release, so it will probably be up in full soon.

The post detailed some of the features of CodeAnalyst 3.2, including Bulldozer (12h family) support, CPU/memory utilization timelines, Visual Studio integration, and the aforementioned heterogeneous profiling. That is by far the biggest addition.

Although it may not sound like much, this part of the release is the key. “If you captured OpenCL information, that will also be shown on the timeline. The timeline has an easy navigation for zooming into the most minute call, while retaining a relative sense of the entire profile. Each GPU device with OpenCL activity will be displayed. A chart for each thread with OpenCL API calls will display the function durations, with double-click, two-way navigation to a detailed data table of the function traces. Kernel and data transfer events are logged and shown in the respective command queues, with the ability to see the latency involved with enqueued events waiting in parallel.”
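To make the latency part of that quote concrete, here is a minimal sketch of the arithmetic a timeline like this would rest on. OpenCL event profiling reports four timestamps per command (queued, submitted, start, end) in nanoseconds. The `EventTimes` record, the `summarize` helper, and the sample numbers are all hypothetical, made up for illustration; none of this is taken from CodeAnalyst itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventTimes:
    """Hypothetical record of one command's profiling timestamps (ns).

    The four fields mirror OpenCL's CL_PROFILING_COMMAND_QUEUED,
    _SUBMIT, _START and _END values; the class itself is an assumption.
    """
    name: str
    queued: int
    submitted: int
    start: int
    end: int

    def wait_ns(self) -> int:
        # Time the command sat enqueued before it began executing.
        return self.start - self.queued

    def run_ns(self) -> int:
        # Time the command spent actually executing on the device.
        return self.end - self.start


def summarize(events):
    """Return (name, wait_ns, run_ns) for each event, in order."""
    return [(e.name, e.wait_ns(), e.run_ns()) for e in events]


# Made-up sample data: a buffer write followed by a kernel that had to
# wait for the write to finish before it could start.
events = [
    EventTimes("write_buffer", queued=0, submitted=100, start=500, end=2500),
    EventTimes("kernel", queued=50, submitted=150, start=2500, end=9500),
]

for name, wait, run in summarize(events):
    print(f"{name}: waited {wait} ns, ran {run} ns")
```

In this invented trace the kernel spends about a third as long waiting in the queue as it does running, which is exactly the kind of thing that stays invisible without a tool that shows both numbers side by side.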

OpenCL has the ability to pick a target for your code to run on, CPU or GPU, and have it ‘just work’, at least in theory. Given the disparity between CPU and GPU tools, sending things to the GPU usually meant looking at your code with the tool equivalent of welding goggles and a divining rod. Trying to find bottlenecks in your code across multiple types of execution units simultaneously made coding for the PS3 seem like lighthearted fun, even counting the inevitable Sony lawsuit.

Since AMD is heavily promoting GPU compute, even holding a conference on the subject, they have a vested interest in people using OpenCL and similar technologies. If coding for a device is a pain, and optimization/debugging is an advanced study in masochism, coders will just say no. AMD finally seems to understand that concept, and is actually making the tools people want and need now, with releases coming thick and fast. This is what they needed to do a few years ago, but it is still a welcome change.

From here, the next big step, possibly the final major hurdle, is to make a system that transparently parses threads to the appropriate device. To do that, you need to know what ‘appropriate device’ means, and the major metric there is performance. For that, you need a tool that can see both CPU and GPU performance counters, data transfer events, and queues/latencies. Now do you see the direction that CodeAnalyst 3.2 is moving us in? S|A
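As a rough sketch of what ‘appropriate device’ could mean once those measurements exist: add up data transfer time, queue wait, and execution time for each candidate, then pick the cheapest. Everything below is hypothetical, including the `DeviceSample` record, the `pick_device` helper, and the numbers; a real scheduler would have far more to weigh, and nothing here reflects how AMD does or will do it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceSample:
    """Hypothetical profile of one workload on one device (all times in ns)."""
    device: str
    transfer_ns: int  # time spent moving data to and from the device
    wait_ns: int      # time spent enqueued before execution began
    run_ns: int       # time spent executing

    def total_ns(self) -> int:
        # End-to-end cost as the caller sees it.
        return self.transfer_ns + self.wait_ns + self.run_ns


def pick_device(samples):
    """Return the name of the device with the lowest end-to-end cost."""
    if not samples:
        raise ValueError("need at least one profiled device")
    return min(samples, key=DeviceSample.total_ns).device


# Made-up numbers: the GPU computes much faster but pays for the copy.
samples = [
    DeviceSample("cpu", transfer_ns=0, wait_ns=100, run_ns=9000),
    DeviceSample("gpu", transfer_ns=4000, wait_ns=600, run_ns=1500),
]

print(pick_device(samples))
```

With these invented figures the GPU wins despite the transfer cost. Raise `transfer_ns` enough and the answer flips to the CPU, which is why the tool has to see transfers and queues, not just raw kernel time.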
