Today Intel is finally talking about the new Core M line of processors, formerly the Y-Series of ULV Broadwell CPUs. Other than the name, SemiAccurate thinks this new chip has some very cool engineering features.
Intel is sadly only talking about the Broadwell-Y CPUs, basically the ULV variants that are now named Core M. This noxious marketing game is sadly not the only one. Today Intel is only talking about Broadwell-Y in vague terms; nothing like speeds, real TDPs, or specs was given. Brace yourself for another endless launch with attempts to grab headlines at every opportunity. For the next 12 months. Shoot me now.
This isn’t to say there was no data or information released. There was, but the way it was done once again snatched defeat from the jaws of victory. Instead of thinking about the things they wanted us to, the author left the briefing wondering why they were hiding so much. Let’s dive in to what they did say, there is some cool tech under the hood. You will have to wait a long time for the numbers you want though.
As you may know, Broadwell is the Tock, or is it Tick, to Haswell’s Tick. Or Tock. Whatever it is, Broadwell is mostly a shrink from 22nm to 14nm, a process change we discussed here. That change has shrunk the -Y variant to 82mm², something they didn’t want to disclose but admitted after a clever journalist figured it out from the other numbers presented. It is a pretty impressive shrink all told, quite in line with the 14nm process promises.
The -Y Broadwells are 2-core only, you will have to wait 11 months to see a 4-core variant and 7 months to see the GT3 version. This variant has 24 EUs in the GPU up from 20 in Haswell. The TDPs are also not disclosed even though the entire talk was about fanless devices enabled by the new power savings tech. By the time these numbers are released it will be hard to care what they are, something that is quite possibly intentional. SIGH.
Intel started out by saying that fanless designs require a 3-5W TDP, intimating that the Broadwell-Y SKUs can have a 3-5W TDP. They don’t, but an OEM has the ability to limit any CPU to that range for thermally significant time periods; the actual TDPs are in the 15W range. This is a very useful thing though. One of the main design goals for Broadwell was widening the dynamic range of the CPU, something that Intel appears to have done well even if Intel wouldn’t give out numbers.
First up on the list of power savings tech is the 14nm process, something covered in another article. Next up is packaging and this is quite an impressive technology leap. On the surface the package goes from 40×24×1.5mm in Haswell-Y to 30×16.5×1.04mm in Broadwell-Y, roughly 50% smaller in XY and 30% smaller in Z. Part of the change comes from a smaller 0.5mm ball pitch, some from a thinner package, some from a thinner die, and some from a tech Intel calls 3DL.
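For those who want to check the shrink claims, here is a quick back-of-the-envelope sketch in Python. It uses only the package dimensions quoted above; the function and variable names are ours for illustration, not anything Intel supplied.

```python
# Package dimensions in millimetres (X, Y, Z), as quoted in the article.
HASWELL_Y = (40.0, 24.0, 1.5)
BROADWELL_Y = (30.0, 16.5, 1.04)


def footprint(pkg):
    """XY area of a package in square millimetres."""
    x, y, _ = pkg
    return x * y


def reduction(old, new):
    """Fractional reduction going from old to new (0.25 means 25% smaller)."""
    return 1.0 - new / old


xy_cut = reduction(footprint(HASWELL_Y), footprint(BROADWELL_Y))
z_cut = reduction(HASWELL_Y[2], BROADWELL_Y[2])

print(f"XY footprint: {footprint(HASWELL_Y):.0f} -> {footprint(BROADWELL_Y):.0f} mm^2 "
      f"({xy_cut:.1%} smaller)")
print(f"Z height: {HASWELL_Y[2]} -> {BROADWELL_Y[2]} mm ({z_cut:.1%} smaller)")
```

The XY figure works out to a bit over 48% and the Z figure to a bit under 31%, so the rounded 50% and 30% claims hold up.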
The thinner die is the easiest to explain, they basically back-grind the wafer to 170µm versus the 200µm in Haswell. That accounts for some of the Z-height savings but much of the rest comes from thinning the package core from 400µm to 200µm. This is a nice chunk of savings but it also means that the inductors in the package are now too thick to fit in the new 200µm package core. Intel solved this with what they call 3DL.
No pictures allowed so we have to use stock, sorry
3DL takes the inductors and mounts them on a PoP-like PCB that is mounted on the bottom of the Broadwell-Y package. Actually there are two PCBs mounted on the bottom for 3DL and they require a hole to be cut in the motherboard, a very expensive change for OEMs. The height of the 3DL module is thin enough not to be a problem even on the thinnest tablet and phone boards, no worries there. Intel also took the opportunity to mount caps beside the 3DL boards to take advantage of the area provided. In short this is a brilliant technical exercise that elegantly solves a Z-height challenge in a unique way. That said all is not roses, it unquestionably costs quite a bit, drops yield, and in general is not something any sane company would choose to do if there was an easier solution.
So why did Intel do it? They started a “thinner is better” arms race for their moronic Ultrabook program, something they took to its logical conclusion and didn’t stop there. “Cool” in PCs is now thin and we have long ago passed the point of sanity, now we are cutting usable volume for batteries, ports, and cooling solution heights to simply dumb levels. The user gets a more expensive, heavier, slower, and shorter battery life device all for a form factor more annoying to use than a slightly thicker one. Intel claims <9mm form factors are possible with Broadwell-Y, a technical feat that while impressive actually hurts the user. That is now progress, can we set fire to the marketing buildings soon?
Back to the technical side of Broadwell-Y, Intel is on the second generation of FIVR, their name for integrated voltage regulators. The second generation has a very interesting trick, yes it does more than capitalize the I this time around. As you may know, the voltage/frequency curves have ranges where they are efficient and ranges where they aren’t so efficient.
At less than 1V, FIVR is less efficient than Intel would like so they put in a bypass mode. The voltage rails feeding the CPU are run from an external VR that runs more efficiently than the FIVR at these low voltages. When Broadwell-Y needs voltages in these ranges they just bypass the FIVR and run directly off the external voltage feeds. This saves energy in ranges where the internal VRs are not efficient, once again quite an elegant solution to a tough problem.
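The bypass decision can be sketched in a few lines of Python. This is an illustration of the idea only. The 1V threshold comes from the description above, but the names, the hard cutoff, and the sample voltages are our assumptions; Intel did not say how or where the switch is actually made.

```python
from enum import Enum


class Rail(Enum):
    FIVR = "on-die FIVR"
    BYPASS = "external VR (FIVR bypassed)"


# Assumption: a single hard cutoff at 1V, based on the "less than 1V" remark.
BYPASS_THRESHOLD_V = 1.0


def select_rail(requested_volts: float) -> Rail:
    """Pick the supply path for a requested core voltage (hypothetical policy)."""
    if requested_volts < BYPASS_THRESHOLD_V:
        return Rail.BYPASS  # FIVR is inefficient here, feed the core directly
    return Rail.FIVR


# Sample voltages are made up for illustration.
for volts in (0.70, 0.95, 1.00, 1.10):
    print(f"{volts:.2f} V -> {select_rail(volts).value}")
```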
Even with the FIVR bypass there is a limit to how low a voltage you can run at. Barring near threshold voltage operation like in Claremont, you can’t go much lower than 1V. Even if you can, the savings for doing so aren’t that good and clocks have a floor as well. If you want to go below that you need to resort to tricks, something Intel did with duty cycles. The idea is simple enough, if you want to run at a lower voltage and frequency, turn the device off for a percentage of the time.
This means if you want to run at half the speed and half the voltage, transistor technology won’t allow you to scale to both that point and multi-GHz top ends. Instead you run it at the voltage floor and lowest sane clock and force sleep modes for half the time. Averaged over a long time period you burn half the wattage and execute half the instructions of the ‘lowest’ possible frequency. Both the CPU and GPU can be duty cycled.
These duty cycles can be extremely granular but Intel implemented it as roughly five steps set about 12.5% from each other. To the end-user a duty cycling Broadwell-Y only appears to run at a slower base clock and pull less energy, the cycling is happening far too quickly to be visible. One technical caveat is that Broadwell-Y runs all cores at the same clock but can vary voltage and P-States across the full range of options. This is almost assuredly to speed up sleep and wake latencies to make duty cycling a sane proposition.
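The duty cycle arithmetic is easy to sketch in Python. Intel gave no numbers, so the floor power and instruction rate below are made up, and the model assumes sleep costs zero power and that sleep and wake transitions are free, which real silicon will not quite manage. The exact set of steps was not disclosed either, so we assume five steps from 100% down to 50%, spaced 12.5% apart.

```python
def duty_cycled(floor_power_w, floor_ips, duty):
    """Average power and throughput when the chip runs at its floor operating
    point for a fraction `duty` of the time and sleeps for the rest.

    Simplifying assumptions: sleep power is zero, transitions cost nothing.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    return floor_power_w * duty, floor_ips * duty


# Hypothetical floor operating point, not an Intel figure.
FLOOR_POWER_W = 2.0
FLOOR_IPS = 1.0e9  # instructions per second at the lowest sane clock

# Assumed steps: 100%, 87.5%, 75%, 62.5%, 50%.
DUTY_STEPS = [1.0 - 0.125 * i for i in range(5)]

for duty in DUTY_STEPS:
    watts, ips = duty_cycled(FLOOR_POWER_W, FLOOR_IPS, duty)
    print(f"duty {duty:6.1%}: {watts:.2f} W average, {ips:.3e} instructions/s")
```

At a 50% duty cycle the averaged power and instruction rate both come out at half of the floor operating point, which is the half-the-wattage, half-the-instructions case described above.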
Another bit of power savings comes from I/O, a traditionally high wattage consumption area. Broadwell-Y now allows for I/O throttling to save power, both to the chipset and to devices. The chipset is also said to drop idle power by 25% and active power 20% while adding features. The main one of the new features is a DSP for wake on voice, essentially a smart sensor hub that is mandated by most OS vendors. Other than that NVMe/PCIe storage has been ‘added’, something we told you about almost two years ago, and yes it is still a 32nm device.
Broadwell die, at least the 2-core one
Next comes the GPU, a new generation device that was meant to be the big performance boost in Broadwell. Times change. With the 14nm shrink comes a new GPU architecture and this one is definitely updated. Instead of 20 EUs in a 2×10 configuration, Broadwell-Y has 24 EUs in a 3×8 configuration. It also has two samplers that give the GPU 50% more throughput, 2 pixels per clock versus one previously. For the rest of the architecture everything is beefed up and most important standards like OpenGL 4.3 and OpenCL 2.0 are now supported.
On the media side things are a bit tricky. Broadwell supports 4K video on internal and external monitors. H.265 encode and decode is a ‘hybrid’ approach, meaning it is not all in hardware yet; some functions are amenable to hardware acceleration and the rest are done in software. The same holds true for most cell phone chips on the market so Intel is far from alone here. The media samplers also have 50% greater throughput and the video engine has twice the performance of Haswell.
Why do we say 4K is a tricky question? Mainly because while Broadwell can do 4K without breaking a sweat, Broadwell-Y can’t. The hardware is there because the vanilla CPU and the -Y variant are the same silicon, but the power limits what can be done with it in a -Y SKU. In short the painfully low power limits of Broadwell-Y mean that it is hard to do sustained 4K on both an internal and external screen at once even if the silicon is more than capable when unconstrained.
Moving back to the core itself we have roughly 5% IPC improvement over Haswell. This comes from many smaller sources like a larger OoO scheduler, faster store-to-load forwarding, larger L2 TLBs, and a second TLB page miss handler for parallel page walks. Some common instructions have been sped up, FP multiplies drop from five to three cycles and several other ops have been sped up too. Not to be left out, VM enters and exits are now faster as well.
Intel did a lot of little tweaks to make the core faster, none of which were a big bang. One change for Broadwell is that the base metric for improvements was made more stringent. Instead of a 1:1 performance:power minimum for a performance-gain feature, Broadwell required a 2:1 ratio for any new or improved functions. This is quite a high bar for microarchitectural changes but Intel did manage to find quite a few this time around.
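To make that bar concrete, here is a small Python sketch of how such a gate could work. The candidate features and their percentages are invented for illustration, and so is the handling of features that cost no power; Intel described only the ratios, not the process.

```python
def passes_bar(perf_gain_pct, power_cost_pct, min_ratio=2.0):
    """True if a feature buys at least `min_ratio` percent of performance
    for every percent of power it costs."""
    if power_cost_pct <= 0:
        # Assumption: a feature that costs no power passes if it helps at all.
        return perf_gain_pct > 0
    return perf_gain_pct / power_cost_pct >= min_ratio


# Hypothetical candidates: (name, % performance gained, % power added).
CANDIDATES = [
    ("feature A", 1.5, 1.0),
    ("feature B", 2.0, 1.0),
    ("feature C", 0.5, 0.0),
]

for name, perf, power in CANDIDATES:
    old_rule = passes_bar(perf, power, min_ratio=1.0)
    new_rule = passes_bar(perf, power, min_ratio=2.0)
    print(f"{name}: 1:1 bar {'pass' if old_rule else 'fail'}, "
          f"2:1 bar {'pass' if new_rule else 'fail'}")
```

The made-up feature A is the interesting case: it clears a 1:1 bar but gets cut under the 2:1 rule.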
That is actually quite a good summary of the changes to Broadwell-Y, no big bangs as you expect from a shrink but a lot of minor things that all add up. That list starts with the shrink, moves on to GPU improvements, then to packaging innovation and power management. In the end it looks a lot like a Haswell, just smaller and thinner. When the Broadwell-Y CPUs come out in the waning days of this year, we will finally see if the result lives up to the hype. S|A