HAVING HAD more than my fair share of conversations with Nvidia personnel, I found one thing painfully clear: they just didn’t get the concepts of reliability and testing. Their flailing over the bumpgate fiasco drove the point home.
The most painful conversation about the whole bumpgate fiasco was with a very senior Nvidian who flat-out insisted that the company did not understand why the failures were occurring. The talking point was that it was a new area of science, and not understood by anyone. Having talked to five separate packaging engineers who all understood the problem and gave me the exact same reasoning for its occurrence, I was confident that the three-part article I wrote on the failures (1, 2, and 3) was technically spot on.
Originally I thought they were just trying to make me believe their version of the truth, but I later found out that was not the case. This dawned on me when I noticed Nvidia was hiring a lot of thermal engineers in late 2008 to solve the problem retroactively. The disconnect was that the bumpgate failures were not due to a thermal problem per se. The chips did not cook to death.
Nvidia’s bumps were cracking due to repeated thermal stresses, leading to physical failure. Hiring thermal engineers to fix that is akin to hiring more gas station attendants to improve a car’s fuel economy: it addresses the wrong problem. The ‘fixes’ showed some creative but problematic engineering choices. By this point, things had gone from morbidly funny to rather sad. Flailing is fun to watch for a bit, but it gets old, even if you call it dancing in a press release. The short story is that Nvidia was not testing properly, and the result was dead chips. That is the ‘science’ that the company didn’t understand, and while its implementation may be complex, the concept most definitely is not.
So it comes as no surprise that Nvidia is finally looking to hire the right person to fix things, almost three years after the problem was first noticed. On July 2, 2009, the date being ironically a year after the notorious 8-K that publicly kicked off bumpgate, the company put up a job listing for a “DIRECTOR OF PACKAGE TECHNOLOGY”. Finally, literally years later, the company got it!
If Nvidia can find someone to sit in that particular hot seat, then in a year or so, it might have a handle on its package problems. Prospective candidates please note that expert testimony skills in class action lawsuits are not listed as a requirement, but if you bring them up during the interview, it might help you out.
On a related note, Nvidia is also looking for a “DIRECTOR OF GPU SOFTWARE”. Looks like the Vista crashing and crashing and crashing was finally noticed too. Since this job was posted the day before, there must have been something in the Santa Clara water this July. Maybe the municipal water district mixed up the Intel ‘fluoride’ feed with the Nvidia one?
In any case, we are glad to see that Nvidia is finally allowing the clues to sink in, even if they had to be applied with a $319 million bat. So far. With two directors in one week, and another earlier in the year, the times they are a-changin’ at Nvidia. S|A
By Charlie Demerjian