A few weeks ago, SemiAccurate speculated about Nvidia’s 28nm yields based on their recent roadmap changes. There were two possible reasons why a consumer GK110 card would be added: really good yields or really bad yields.
Literally the day after we published, the always interesting C’t magazine got their hands on the specs for the Tesla K20, the big compute version of the GK110. You can find the full story in German here, and English-ish here. The short spec list: 2496 shaders arranged in 13 SMXs, running at 705MHz, all pulling 225W.
Compare and contrast this to the number of physical SMXs on the chip, 15, and the number of shaders, 2880. That means the GK110 productized as the K20 only has 13/15ths of the GPU working; two SMXs are fused off. On the power side, 225W for 13 SMXs at 705MHz is a bit better than we would expect, since the GK104 pulls 195W for 8 SMXs at 1006MHz. So power use is less than feared, but yields are worse than you would expect.
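The arithmetic behind that comparison can be sketched in a few lines. This is only a rough sanity check using the numbers quoted above. The linear scaling in `naive_scaled_power` is our own simplifying assumption, not anything Nvidia has published: it pretends board power scales directly with SMX count and clock, and ignores voltage, memory, and everything else outside the SMXs.

```python
# Figures quoted in the article.
GK110_PHYSICAL_SMX = 15
GK110_PHYSICAL_SHADERS = 2880
K20_ACTIVE_SMX = 13
K20_CLOCK_MHZ = 705
K20_BOARD_POWER_W = 225
GK104_ACTIVE_SMX = 8
GK104_CLOCK_MHZ = 1006
GK104_BOARD_POWER_W = 195

# Shaders per SMX follows from the full-chip figures.
shaders_per_smx = GK110_PHYSICAL_SHADERS // GK110_PHYSICAL_SMX

# Shader count and working fraction of the chip with 13 SMXs enabled.
k20_shaders = shaders_per_smx * K20_ACTIVE_SMX
active_fraction = K20_ACTIVE_SMX / GK110_PHYSICAL_SMX


def naive_scaled_power(ref_power_w, ref_smx, ref_clock_mhz, smx, clock_mhz):
    """Scale a reference board power linearly by SMX count and clock.

    Hypothetical model for illustration only: real power also depends
    on voltage, memory, and non-SMX logic.
    """
    return ref_power_w * (smx / ref_smx) * (clock_mhz / ref_clock_mhz)


# What the GK104 figures would predict for a 13 SMX, 705MHz part.
k20_naive_power_w = naive_scaled_power(
    GK104_BOARD_POWER_W, GK104_ACTIVE_SMX, GK104_CLOCK_MHZ,
    K20_ACTIVE_SMX, K20_CLOCK_MHZ,
)

print(f"shaders per SMX:       {shaders_per_smx}")
print(f"K20 shaders:           {k20_shaders}")
print(f"working fraction:      {active_fraction:.1%}")
print(f"naive scaled power:    {k20_naive_power_w:.0f}W")
print(f"quoted K20 power:      {K20_BOARD_POWER_W}W")
```

The shader count falls straight out of the SMX count, which is a handy check on the leaked specs. The scaled power figure is only as good as the linear assumption behind it, so treat it as a ballpark rather than a prediction.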
Back to the original premise, you can read a lot about yields into the choices Nvidia made for the K20. First is that they had to fuse off two full SMXs, so granularity of design and repairability are still not considered good things on one side of the San Tomas Expressway. That is something we can’t explain. Next up is that the yields are horrific.
We say that because the K20 is probably the lowest volume GK110 variant; the consumer versions will sell an order of magnitude or more units than a $3000+ engineering tool. Since yields generally fall along a bathtub curve, if the consumer card could be done with 15 active SMXs, pulling enough fully working chips off the line for a much smaller run of cards costing many times as much should be easy. In fact, Nvidia could possibly pull out a sub-bin of low power chips that would make the cards look better. Likewise, if there were enough parts for a 14 SMX consumer card, a 15 SMX K20 should be an easy one to make. Follow the pattern and a 13 SMX compute card means yields support a 12 or 13 SMX consumer variant.
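To see why the bins nest this way, here is a toy model, not Nvidia’s actual yield data. It assumes each of the 15 SMXs is independently bad with some probability `p_bad`, so the number of good SMXs per die follows a binomial distribution. Both the independence assumption and the `p_bad` values are hypothetical, and the model ignores defects outside the SMXs as well as clock and power binning.

```python
from math import comb


def fraction_with_at_least(good_needed, total_smx=15, p_bad=0.1):
    """Fraction of dies with at least `good_needed` working SMXs.

    Toy binomial model: every SMX fails independently with
    probability `p_bad` (a made-up illustrative number).
    """
    p_good = 1.0 - p_bad
    return sum(
        comb(total_smx, k) * p_good**k * p_bad**(total_smx - k)
        for k in range(good_needed, total_smx + 1)
    )


if __name__ == "__main__":
    # Compare a few hypothetical per-SMX defect rates.
    for p_bad in (0.05, 0.15, 0.30):
        bins = {
            n: fraction_with_at_least(n, p_bad=p_bad) for n in (15, 14, 13, 12)
        }
        row = "  ".join(f">={n} SMX: {frac:6.1%}" for n, frac in bins.items())
        print(f"p_bad={p_bad:.2f}  {row}")
```

Whatever `p_bad` you pick, the pool of dies that can make a 15 SMX bin is a subset of those that can make 14, which is a subset of those that can make 13. A low volume part gets first pick of the best dies, so if it ships two SMXs down, the better bins cannot be well stocked.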
Clocks are also fairly low, lower than expected. The reasons for this have more to do with the binning choices Nvidia made on the GK104 than anything about the GK110. That said, the low clocks are probably a large chunk of how Nvidia could keep the power usage so low.
What the end result is won’t be known until the consumer parts come out, but the pointers are very clear. Based on what the GK110 is capable of, the specs of the single released high end card, and the previous Nvidia 28nm woes, it looks like the company has yet to figure out its systemic process problems. The more things change… S|A