Nvidia’s Maxwell process choice

Analysis: 28nm versus 20nm in 2014

The first few scraps of information about Nvidia’s Maxwell generation are coming out, and they have some really serious implications. The news itself is not that earth-shattering, but it brings sharper focus to TSMC’s 20nm process.

First the news. Maxwell is the generation of Nvidia GPUs following the current Keplers, and Kepler still has a refresh called GK114 coming out in the spring of 2013. Maxwell is due after that, historically a fall schedule in Q4/2013, but current release schedules seem to indicate that the entire industry has slipped to spring releases. That means Q1/2014 is much more likely as a release date.

So what’s the news? Nvidia is internally hoping for an 80% performance increase, said to be performance per watt. While this seems a bit unlikely, especially in light of the claims for GK114 vs where it ended up, at least you know the range they are aiming for. The more interesting news is that the initial Maxwell parts will be made on TSMC’s 28nm process, not 20nm as is widely rumored.

28nm means that Maxwell will move GPUs right back to the big die strategy for consumer parts. Since there isn’t much more efficiency to be had from a 28nm transistor after the GK114 generation and the basic shader designs are already fairly optimized, any gains will likely come from more efficient scheduling and power management. That is the long way of saying that there are unlikely to be any big bangs, just lots of little ones squeezing a much bigger chip.

So what comes into focus? That is the more interesting part, and it centers around why Nvidia would choose a 28nm process for this part. Why would they still use that process in 2014? Could they not have figured out their process problems yet, and know it? What does this have to do with Apple, AMD, and even Qualcomm? Good questions.

The most obvious implication in all of this is that Nvidia is not confident that they can make a big chip on a new process and have it yield at marketable levels. Early 40nm woes for the industry were solved in due time, mainly once the nitrogen leak at TSMC was plugged, though we are not supposed to know that, and things were peachy from then on. Unless you were Nvidia; then it was all TSMC’s fault, and still is.
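To see why a big chip on a young process is such a risk, here is a rough sketch using the textbook Poisson yield model, Y = exp(-A × D0), where A is die area and D0 is defect density. The die sizes and defect densities below are illustrative assumptions only, not TSMC or Nvidia figures, and real foundry yield models are more involved.

```python
import math

def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of defect-free die under the simple Poisson model Y = exp(-A * D0)."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

# Hypothetical defect densities and die sizes, chosen only to show the trend.
for label, d0 in [("mature process", 0.1), ("immature process", 0.5)]:
    small = poisson_yield(100, d0)
    big = poisson_yield(500, d0)
    print(f"{label}: 100 mm^2 die yields {small:.0%}, 500 mm^2 die yields {big:.0%}")
```

Under this model yield falls off exponentially with die area, so the same immature process that is merely painful for a small chip can be ruinous for a big one.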

Most industry observers seemed to think that the problem was worked out and yields were OK after a few quarters, mainly because Nvidia profits did not take the expected ding. A sign that they were not OK was chronic product shortages, coupled with almost all variants having some disabled units. We disagreed with the analysts, and now you know why.

In the end, we were right: yields were abysmal, and stayed that way. Nvidia never got a handle on 40nm for the GT line. How did that not show up in the financial reports? Easy: the deal Nvidia had made with TSMC was to pay for only good die, an almost unheard-of situation in the industry. Nvidia never had to fix their yield issues; they could just run more wafers at low yields because TSMC was eating the loss. It wasn’t their problem, and the financial reports continued to look good.
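A back-of-the-envelope sketch shows why a pay-for-good-die deal hides bad yields from the buyer's books. All prices, die counts, and yields here are made-up assumptions for illustration; the actual terms of the Nvidia and TSMC agreement are not public in this detail.

```python
import math

def cost_per_good_die_wafer_pricing(wafer_price, dies_per_wafer, yield_fraction):
    """Buyer's cost per good die when paying per wafer: bad die are the buyer's loss."""
    return wafer_price / (dies_per_wafer * yield_fraction)

def wafers_needed(good_dies_wanted, dies_per_wafer, yield_fraction):
    """Wafers that must be run to deliver the requested number of good die."""
    return math.ceil(good_dies_wanted / (dies_per_wafer * yield_fraction))

# Hypothetical numbers only.
WAFER_PRICE = 5000.0     # assumed price per wafer
DIES_PER_WAFER = 100     # assumed candidate die per wafer
GOOD_DIE_PRICE = 100.0   # assumed fixed price per good die under the special deal

for y in (0.6, 0.2):
    per_wafer_cost = cost_per_good_die_wafer_pricing(WAFER_PRICE, DIES_PER_WAFER, y)
    wafers = wafers_needed(10_000, DIES_PER_WAFER, y)
    print(f"yield {y:.0%}: wafer pricing costs the buyer {per_wafer_cost:.2f} per good die, "
          f"good-die pricing stays at {GOOD_DIE_PRICE:.2f}, "
          f"and the foundry runs {wafers} wafers for 10,000 good die")
```

With normal wafer pricing, the buyer's cost per good die rises as yield falls. With good-die pricing it stays flat, and the extra wafers become the foundry's problem.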

This was hidden from the public and analysts until much later, when it was revealed deeply buried in a 10-K form. Refreshing as this news was, it was a mere footnote by that time; the world had moved on to 28nm. By then, Nvidia was saying that TSMC 28nm had no problems, everything was fixed, and they were going to show the world their new-found process prowess. SemiAccurate again disagreed.

Then Nvidia was asked by an analyst. Having sworn to the press that 28nm had no problems, they could not say the same things to an analyst: 28nm was not going well for them. SemiAccurate has been chronicling Nvidia’s process woes for years, and the underlying problem is both simple and as yet unaddressed: management. Until management changes, their process woes are unlikely to be fixed, and that means 20nm problems are extremely likely. Moving back to Maxwell, does it being built on a 28nm process indicate that management has finally come to their senses?

If you have followed Nvidia for any length of time, you probably realize that this is unlikely to happen; the company has a track record of blaming everything on external problems. Nothing is ever their fault, and conspiracies abound. SemiAccurate rates the chances of Maxwell being on 28nm because of an engineering choice at almost zero. We also feel that 20nm will go no better for the company than the last two processes did; the internal problems have simply not been addressed.

So why did Nvidia stick with 28nm? The next thing that pops to mind is whether TSMC’s 20nm process will be ready in time for Maxwell in the spring of 2014. TSMC is claiming they are running 20nm test wafers now. GlobalFoundries is also running 20nm test wafers. Will it be production ready in time for Maxwell? In volume? If Maxwell is released in Q4 of 2013, the answer is almost assuredly no. If Maxwell is a spring of 2014 product, the answer changes to possibly. So what’s the problem? Apple. Then AMD, followed closely by Qualcomm.

As has been rumored for a while now, Apple is moving some if not all production away from Samsung at the 20nm node, and TSMC is the beneficiary. If Apple does as Apple usually does, and TSMC is as covetous of their business as we are told, then any deal signed will give Apple preferential treatment for supply. Apple needs a lot of wafers, and Apple doesn’t like to be told no. They also don’t like suppliers that don’t deliver, usually a fatal transgression at 1 Infinite Loop.

Since Apple is now the customer, singular, for TSMC at 20nm, the others will have to sit and wait, hoping for some scraps to fall off the table. Apple is unlikely to let any fall, especially if those lining up for scraps include two competitors, Nvidia and Qualcomm. Every 20nm wafer these two don’t get is a few hundred Android devices that do not get a CPU. That makes Apple happy. This is a very basic supply chain game, and Apple is very adept at it, has the buying clout to back it up, and the moral fiber to use it.

AMD is less of a concern to Apple; they have nothing that competes with Apple’s products, only chips that go into them. Of the big semiconductor houses, AMD is just about the only one that can be considered friendly, or at least not an enemy, to Apple. While this won’t mean Apple helps AMD out, it does put them at the top of the preferred outlet list for any wafers Apple has to spare.

No matter who could possibly use those wafers, only AMD, Qualcomm, and Nvidia are potential TSMC 20nm customers and would consume large volumes of leading edge wafers. The rest are unlikely to have any designs ready, mainly because they didn’t plan on early 20nm capacity. Nvidia is a direct Apple competitor, Qualcomm is on some lines, but AMD is not, other than tangentially. If Apple has any say in early 20nm wafer allocation, that tells you their likely preference. On the other hand, they could just buy up every 20nm wafer TSMC makes for a year and dictate who gets them by directly reselling to only the preferred buyer. Given how Apple has worked in the past, this is a likely scenario.

In the end, Nvidia is promising Maxwell on 28nm, not 20nm, and with healthy performance gains. This strongly indicates a return to big chips, and the commensurate costs associated with them. It also hints at an inkling of management insight, but that is not likely the case. In the end, Nvidia went 28nm because it was the only path available to them. Apple likely bought up the entire run, and Nvidia was, to use industry jargon, SOL. If nothing else does it, Apple is going to effectively delay 20nm chips for everyone; the only question is for how long. S|A


Charlie Demerjian

Roving engine of chaos and snide remarks at SemiAccurate
Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate.com. SemiAccurate.com is a technology news site addressing hardware design, software selection, customization, securing and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also a council member with Gerson Lehrman Group.