A look at Tegra 3, 3.3 and 4

Part 1: Tegra 3 warts lead to a new variant

What’s going on with Nvidia’s (NASDAQ:NVDA) Tegra line lately? Nothing good, we hear in the back channels, which might be why we’re seeing more ‘unofficial’ back-channel PR releases through the usual proxies. You always know that something is going pear shaped when Nvidia PR bangs the drum this vigorously, so let’s dive into the mole tunnel.

As of Computex, Tegra 3/T30 was set for the end of Q3, at least according to the people who are in line for samples. The fab side said there are two parts to the problem: yields were awful, and performance on a per-clock basis was nowhere near what the company expected. Both are worrying for different reasons.

Let’s look at each part of the problem. The yield issues are surprising because the chip is made on TSMC’s 40nm process, which is a known quantity at this point. We expect this to be solved quite readily before the intended launch date unless there is a circuit design issue in the uncore. The core itself is an ARM black box, so implementing that should be about as much of a no-brainer as is possible. Then again, Nvidia is known to surprise even the most jaded onlooker.

The second, and more interesting, part of the problem is the IPC/performance issue. There are two candidates for this one, and SemiAccurate’s moles can’t filter out exactly where the problem lies. As we said, the core is a no-brainer, so the problem is extremely unlikely to come from that side of the chip. That leaves the uncore and interconnects, things that Nvidia can change in good or bad ways.

The biggest of these uncore ‘whoopsies’ that we have heard about is that the chip has only one memory channel/controller. If true, then relative to Tegra 2 the core count doubles, the CPUs gain NEON SIMD instructions, and more GPU cores are added, all feeding from a memory controller that was already under a lot of pressure. This could easily explain the IPC problems, and it definitely explains Nvidia’s rather curious choice of demos to show the chip off with. Look for a slew of bandwidth-light demos in the future; honest benchmarks won’t end well for the green team. As an aside, the official PR problem deflection line is, “Don’t look at the count of memory controllers, look at how efficient they are”. Heh. Sources say that ‘craptop’ makers don’t have a similar take on the situation, and the official explanation is not working out so well with that set.

If the IPC problems are not solely memory bandwidth related, the likely candidate would be the interconnect. Nvidia hasn’t exactly shown off their engineering prowess on this front in the past, and our moles in Santa Clara say that things are far from solved here.

As a sign of the extent of the changes occurring, a new code name keeps popping up: T33. There are three Tegra 3 variants so far, T30, T30s, and T30ab; the new versions are called T33, T33s and AP33. We hear that they are basically a re-working to increase clock speeds, with a targeted gain of 200-300MHz, presumably at the same power levels.

Some sources say that T33 replaces T30, others say that it will be a follow up part. If it is a simple follow up, no big deal. If it is a replacement for T30, it is hard to see how this won’t slip the schedule quite a bit, though it could be relevant to the IPC and yield problems. Nvidia is then looking at two choices: ship a competitive chip late or ship a lame duck on time. Neither is a good option and we don’t envy the managers and bean counters making that decision.

One thing that several sources agree on is that the speed bump is basically due to Qualcomm’s Krait chip. People who have Kraits in-house tell SemiAccurate that the dual core version mops the floor with a 1.5GHz quad A9, something that a vanilla dual core A15 should not be able to do. The particular A9 variant in question wasn’t named outright, but there is only one quad A9 sampling right now.

An extra 200-300MHz now sounds like a necessity to hang on to design wins, but let’s see what silicon ends up being delivered. We hear ‘craptop’ makers are particularly displeased with Tegra 3, both for raw performance and for anemic I/O. More clock can only fix one of those things.

Moving on to the next chip, the T40, there is one big chunk of news. The chip that will be marketed as Tegra 4 is based on ARM’s A15 design. Our moles originally said that the T40 was more or less a shrink of T3x, so this seems somewhat surprising. Since T5x is not A15 based, it is interesting that Nvidia would put the time and effort into a new core for a SoC with a short market window.

That brings us to Tegra 5 or T50, also known as the basis for the vastly over-hyped Project Denver. Of all the chips coming out in the near future, this is the most technically interesting. On paper, it looks like quite a bold move, but as always, let’s see what, and when, the company delivers in silicon. This core will be covered in part 2. S|A

Charlie Demerjian

Roving engine of chaos and snide remarks at SemiAccurate
Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate.com. SemiAccurate.com is a technology news site addressing hardware design, software selection, customization, securing and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also available through Guidepoint and Mosaic.