In a widely reported GPU roadmap update during GTC, March 2013, Nvidia announced the new Volta family of GPUs. This was seen as a triumph by most of the press, but in reality it raises two red flags, one technical and one financial.
On the surface Nvidia is doing something no one else has publicly announced with Volta: putting stacked memory on a GPU package. Since no one else has publicly announced this, to the technically illiterate it sounds like a big deal. If you understand the technology, however, it is the opposite: a very public acknowledgement of the abject mess the company is in. The process woes that Nvidia had in the past seem to still be in force.
Volta mockup shown at GTC
Traditionally, Nvidia has a roadmap that includes minor freshening every year, a big update every two years, and a major architectural change every four years. This has gone on for quite a while, at least since the GT200/Tesla days. GT200 was a new architecture and had a mid-life update and shrink from the GT20x to the GT21x line. Fermi aka GF10x was the two-year big update to Tesla and was updated to the GF11x line after a year. Tesla and Fermi are a single underlying “four-year” architecture with the latter having a massively updated uncore and interconnect network as the main new feature.
The new roadmap includes Volta
Kepler is a similarly new architecture to Tesla, with a key differentiator being the loss of Tesla/Fermi’s “hot-clocked” shaders. Kepler was meant to have a follow-on update to the current GK10x family, called GK11x, due to arrive about now. It was summarily canceled quite recently, with minor tweaks to GK10x replacing it. There was new silicon called GK11x on the roadmaps given out to partners and OEMs last fall, and it is not there now. Also note that the GK110 part marketed as Titan is a completely different family from the smaller GK10x and GK11x lines; names aside, they are very different lines.
The next major two-year update is called Maxwell and it brings unified virtual memory according to Nvidia’s CEO Jen-Hsun Huang at the GTC keynote linked above. No word as to how that will work with PCIe latencies and the lack of a CPU on the same die, but that is another issue entirely. What you should note is that it is a mild refresh in the same way Fermi was to Tesla. The follow-on to Maxwell, Volta, should be the next new four-year architecture.
With that background in mind, why are we saying Volta is not a positive? New architecture, stacked memory that no one else has announced, and all of this due in 2016 or so? The first problem is stacked memory. Nvidia is actually behind on this curve; the two-year lag our packaging moles had been reporting for a while now has stretched to four years or more. If you doubt this, take a look at the date on this story.
AMD stacked memory working in 2011
That picture we published in October of 2011 showed a prototype AMD device with stacked memory. We didn’t give many details at the time. However, now that much of it is history, we don’t think there will be roast mole if we share a bit more about AMD/ATI and their stacked memory program. The first AMD product slated to have stacked memory was Tiran, the HD89xx GPU that was due in 2012. This was meant to be followed by the Kaveri CPU, then due out in late 2012, and Hawaii, which we officially know nothing about, really.
Note the top right GPU, the rest removed by SemiAccurate
None of these products ended up with stacked memory. Tiran never came out, the version of Kaveri that was way ahead of the packaging game was canceled with the new version then delayed by a year or more, and word has it that Hawaii no longer has stacked memory as an option. Before you bring up issues like yield, technical snafus, and unconstrained cost, our moles were very clear that all the technical and cost issues had been solved before Tahiti. It could have been a viable and cost-effective product if AMD product planners had chosen to bring it to market. For reasons out of the scope of this article, they didn’t.
The take-home message is that AMD had stacked memory on GPUs ready and waiting in late 2011. They still have that knowledge and can roll it out at any time for any product they choose to, top to bottom, mobile or discrete. AMD did not crow about it years prior to Tahiti, they just did the engineering quietly in a lab somewhere and worked all the kinks out. Nvidia is now talking about doing the same with Volta in 2016?
One company was showing off functional prototypes over two years ago; the other is not able to show anything more than a diagram that may see the light of day in three years if all goes well from here. Volta is not a technical triumph, it is a red flag of process and packaging problems waved over their base technical competence. You might recall this, this, and this as the last time there was a packaging issue at the company; the stakes are much higher now.
At GTC, Nvidia’s CEO got up on stage and in quite direct terms said, “We are four+ years behind the competition” but couched it in technical terms to sound like a triumph of technical ability. Those who work in the field were indeed stunned by the admission, but not for the reasons Nvidia was hoping to convey. It appears that Nvidia is slipping dangerously far behind in areas of basic tech needed to be competitive mid-term. Unfortunately that is not the full situation.