AMD (NYSE:AMD) is finally bringing their FirePro professional line into the VLIW4 era with two new cards, the V7900 and the V5900. Both are based on the ‘Cayman’ ASIC that powers the current HD69xx line of graphics cards, but the two cards end up quite different.
The new cards are both mid-range parts. The previous Evergreen based FirePro series ran from the V3800 line to the V9800, and the new ones add 100 to the mid range V7800 and V5800 GPUs. So far so ordinary. The interesting bit is that they both are Cayman/VLIW4 chips, not the older 4+1 shader setup. For gaming, this change was a wash, but for ‘professional’ use, the story changes radically.
The lineup itself, both of them
Short story, the V7900 is fused off to 1280 shaders, down from the 1536 shaders of the full Cayman part. The large chip boasts about 2.7TF in SP and 683GF in DP, while the variant in the V7900 has 1.86TF and 464GF respectively. This puts the clocks of the V7900 at about 730MHz, and the memory bandwidth is down a bit, 160GBps vs 176GBps in the HD6970. The V5900 is, well, mostly fused off. It only has 512 shaders, putting it at 610GF SP and 141GF DP performance. Memory bandwidth is down to 64GBps as well. This is bad, right? Actually not, there is a really good reason for this neutering.
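For those who want to check the clock math, here is a minimal sketch of how the roughly 730MHz figure falls out of the quoted numbers. It assumes peak SP throughput is counted as two FLOPs per shader per clock (one multiply-add), which is the usual convention for these peak figures. The function name and the card table are made up for illustration.

```python
# Back out the core clock implied by a quoted peak SP figure.
# Assumption: peak SP FLOPS = shaders * 2 (one multiply-add) * clock.

def implied_clock_mhz(peak_sp_gflops, shaders, flops_per_shader_per_clock=2):
    """Return the core clock in MHz implied by a peak SP rating."""
    flops = peak_sp_gflops * 1e9
    clock_hz = flops / (shaders * flops_per_shader_per_clock)
    return clock_hz / 1e6

# (peak SP GFLOPS, shader count) as quoted in the text above
cards = {
    "Full Cayman": (2700, 1536),
    "V7900": (1860, 1280),
    "V5900": (610, 512),
}

for name, (gflops, shaders) in cards.items():
    print(f"{name}: ~{implied_clock_mhz(gflops, shaders):.0f} MHz")
```

Under that assumption the V7900 works out to a little under 730MHz and the V5900 to roughly 600MHz.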
The actual cards
So, why is cutting between 17 and 67% of the shaders off a card while lowering the clocks a good thing? It isn’t what I am smoking, nor anything in the water, it is simply power, electrical power. As you might have noticed, the cards above are both single slot form factors, and both come with 2GB of GDDR5. The V7900 has only one 6-pin power connector and a max power draw of less than 150W. The V5900 has _NO_ PCIe power connectors, it pulls less than 75W. When was the last time you saw a >1TF compute device fit in one slot? No, not those finicky toys, a real world useful >1TF compute device.
The idea was simple, to allow the user to cram lots and lots of these cards into a small space while still being able to power it. If you compare that to a ‘247W’ dual slot Tesla that only puts out 1.03TF SP but has a slightly better 515GF DP, there really is no comparison in performance per Watt. GPU compute is all about power density, and these cards are a clean kill. Factor in the price tags of $999 for the V7900 and $599 for the V5900 and it is game over.
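To put rough numbers on the performance per Watt comparison, here is a small sketch using only the figures quoted in this article. The FirePro wattages are ceilings ("less than 150W" and "less than 75W"), so the FirePro results are floors rather than measurements. The table layout and function name are hypothetical.

```python
# Peak throughput (GFLOPS) and board power (W) as quoted in the article.
# Assumption: the FirePro wattages are upper bounds, so their GF/W values
# are lower bounds, not measured efficiency.
CARDS = {
    "V7900": {"sp": 1860, "dp": 464, "watts": 150},
    "V5900": {"sp": 610, "dp": 141, "watts": 75},
    "Tesla": {"sp": 1030, "dp": 515, "watts": 247},
}

def gflops_per_watt(card, precision):
    """Peak GFLOPS per Watt for a card at 'sp' or 'dp' precision."""
    spec = CARDS[card]
    return spec[precision] / spec["watts"]

for name in CARDS:
    sp = gflops_per_watt(name, "sp")
    dp = gflops_per_watt(name, "dp")
    print(f"{name}: {sp:.1f} SP GF/W, {dp:.1f} DP GF/W")
```

On these assumptions the single precision figures are where the gap is widest, with the V7900 at roughly three times the Tesla's peak SP per Watt.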
So, the new FirePro cards have an untouchable performance per Watt advantage, but is that it? Short story, no, that is just the start. Two other features carry over from their 69xx brethren, Eyefinity and custom color correction/monitor calibration. For the home user or gamer, these are nice but not must have features. For someone working on design, publishing, animation, or modelling tools, they are absolutely indispensable.
Dassault Catia, Abaqus, and Isight
At a briefing a few weeks ago, the company brought out two partners to show off how Eyefinity works in the high end space. The demo above consisted of three pieces of Dassault software, Catia, Abaqus and Isight, all made into one seamless workflow system. Instead of one or two monitors, you can have all three parts open, running, and accelerated at once. The same type of layout was shown with 3DS Max, Mudbox and Photoshop, everything you need at once.
Speaking as someone who has two 30″ monitors on his desk, and is trying to justify moving the third one from the gaming test rig to the work machine, I can say without hesitation that more monitors do increase productivity. It makes things SO much easier, more so if you are using a broken OS without a workspace switcher. This is a killer app in the ‘professional’ graphics space, no question there.
If you look at the physical cards, you will also notice a few details. You can’t see it in the picture above, but the V5900 has one DVI and two DisplayPort connectors while the V7900 has four DisplayPort outs. AMD only lists this as three and four monitors, but since both cards support DP1.2 natively, you can chain up to three DP monitors on each out. Both cards should support six monitors with ease.
On the top edge, you may notice that the V5900 has one Crossfire connector and the V7900 has two. That means two V5900s can be chained together or, if you really are in the mood, four V7900s. The V7900 also has full genlock/framelock capabilities, a first for this class of cards. You can see the connector just under the Crossfire ports. Does 24 screens synced to an external time source sound appealing to you? If you need anything near that number, the V7900 is tens of thousands of dollars cheaper than any existing solution, and probably a lot less finicky too.
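The monitor math behind the six per card and 24 screen figures can be sketched as follows. It assumes up to three monitors chained per DP1.2 output, as described above, and treats six displays per card as a hard cap. That cap is an assumption taken from the six monitor figure in the text, and the names in the code are illustrative only.

```python
# Assumptions: each DP1.2 output can chain up to three monitors (per the
# text), and a single card tops out at six displays (the six monitor figure).
DP12_CHAIN_LENGTH = 3
MAX_DISPLAYS_PER_CARD = 6

def displays_per_card(dp_outputs, other_outputs=0):
    """Monitors one card can drive, limited by the per-card cap."""
    raw = dp_outputs * DP12_CHAIN_LENGTH + other_outputs
    return min(raw, MAX_DISPLAYS_PER_CARD)

v5900 = displays_per_card(dp_outputs=2, other_outputs=1)  # one DVI, two DP
v7900 = displays_per_card(dp_outputs=4)                   # four DP

print("V5900:", v5900)
print("V7900:", v7900)
print("Four V7900s:", 4 * v7900)
```

With four V7900s chained over Crossfire, that is where the 24 screen number comes from.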
Not wanting to pass up an opportunity to put the boot in to a prone opponent, AMD nicely pointed out that you can’t do three monitors, much less six, on Nvidia cards. To do three monitors, you need two GPUs, two slots, more power, and a lot more cash. In addition, you don’t have the seamless multiple screen spaces with Nvidia, so things like maximising windows or video across screen boundaries will likely lead to hilarity. Factor in that a FirePro card is cheaper than a single comparable Quadro or Tesla, and….. boot.
The other must have feature is independent screen colour calibration, independent colour gamut, and more knobs to tweak than you have time to turn. Again speaking as someone with three different 30″ panels in his lab, this is handy for even the non-‘professional’ user. If you are doing anything that requires real colour calibration, this feature is not only handy, but allows you to add monitors without worrying about minute panel differences on the same line, a real headache in the print world. You can take any two monitors of any size, resolution, or capability, and just add them on. This is a lifesaver, it allows users to add monitors without replacing the existing ones.
The list of features carried over from the HD69xx chips is too long to list, but if you want more depth on Eyefinity, colour calibration, or PowerTune, take a look at the linked articles above. In addition, AMD has coined the term Geometry Boost to refer to the dual front ends in Cayman. For gaming and demos, showing off geometry performance was an exercise in how many times you could subdivide a polygon that already was smaller than a pixel, leading to some very dubious performance claims. For the professional market, it could very well make a big difference, sub-pixel rendering does have some useful applications.
The last piece of the puzzle is the VLIW4 architecture. Earlier, we said it was more or less a wash in consumer apps: some games were a hair faster, some were a hair slower, and overall performance stayed about the same. A VLIW4 shader unit is, however, about 10% smaller than the older 4+1 setup, leading to more shaders per unit area. This is never a bad thing, but it is hardly a killer feature, especially in light of a new driver stack.
The drivers are maturing with every monthly release, so the gaming benefits of VLIW4 are becoming more apparent every day, but are not enough to make a fanboi swoon. On the professional side however, the architecture really makes things fly. The old T-Unit (the +1 in the older 4+1 architecture) is no more, and each of the 4 shaders in a VLIW4 cluster can do the same thing in parallel. This vastly simplifies drivers and shader compilers, adding up to more efficient use of resources, not to mention lessening coding headaches. VLIW4 should shine in professional apps, that is what it was designed for.
Both of these cards are available now, and are only the tip of the iceberg. No company launches a midrange only range of cards, so expect many of these capabilities to migrate up to the inevitable V8900 and V9900 cards, then down to lower lines, and finally into mobile workstations like the Dell M6600. It is going to be a while before Nvidia can match any of these features, and the singing moles of Sunnyvale (Subterrania soundofmusicus) say that there is more to come soon. S|A
Updated: Fixed naming errors, V9800 -> V7900