A long look at the Intel Cascade Lake 9200 line

Is there a point other than PR?

You may have noticed SemiAccurate glossed over Intel’s new 9200 series of Cascade Lake CPUs. We did this for a reason: they aren’t real products, just a PR stunt meant to stave off embarrassment at the hands of AMD’s Rome.

Yes we said it, Cascade Lake-AP is a PR stunt and an expensive one at that. There is no point in spending the money to engineer, validate, design systems for, and bring this turkey to market other than for Intel to claim they aren’t being beaten like a drum by AMD’s Rome. The problem is simple, AMD’s Epyc based on Rome will have 64 cores vs Intel’s best Xeon 8280 at 28 cores. This isn’t to say the 8280 is a bad part, we really don’t think it is, just that Rome is in a different league. AMD out-engineered Intel, period, and will have a 50%+ lead in per-socket performance in a few weeks.

So when you are behind, what do you do? Spending money on a PR boondoggle is not the correct answer. Guess what? Intel resurrected the dual-die Cascade Lake-AP line to ‘counter’ the AMD threat. We say resurrected because when SemiAccurate exclusively brought you the news of the -AP line last July, we were fairly surprised. Why? Because we had heard about it over a year before and were told it was killed because everyone involved realized it was a bad idea being done for all the wrong reasons. Customers were said to have given similarly positive feedback as well. Intel listened and did the right thing for the right reasons: Cascade-AP, or Sky-AP at the time from what we are told, was killed.

Then Rome. Panic. Response. Lots and lots of money spent. The end solution was so popular there is literally no OEM/ODM willing to build boards and systems for this beast. If it had a coat of paint and the FPGA socket extension it could pass for a white elephant. Actually you can buy an elephant for quite a bit less. If you are thinking that SemiAccurate believes there will be no good that will come of the Xeon 9200 line, you are a bit off. Let’s look at what it is and what it is good for.

Cascade Lake SKU Stack

Eyestrain for four SKUs

There are four SKUs of Cascade-AP: the 32-core 9221 and 9222 at 250W, the 48-core 9242 at 350W, and the 56-core 9282 at 400W. As an interesting side note, three months before launch no OEM/ODM SemiAccurate talked to had heard of the 9282, a 400W TDP, or a 56-core part at all. SemiAccurate’s educated guess is that someone at Intel got hold of a Rome and benchmarked it…..

Intel Cascade-AP 9200 2U air cooled chassis

Air cooled 2U chassis

In an unusual step Intel will be making the systems for the 9200 line itself. It will not be sold as a bare CPU and it won’t even be socketed per se; you can only buy it as a system, in five configurations. The base chassis is called the FC2000 and it will support up to four half-width sleds at 1U or 2U heights. There are also 1U and 2U full-width sleds but we don’t expect to see them in the wild. The base configuration is closed loop water cooling but the 250W and 350W 2U versions can be air cooled. Yes this is a density play, and 2S 4-die systems in 1/2U is a pretty good density play.

Intel Cascade-AP 9200 die layout

1S or 2S?

Logically speaking however the system is a bit of a head scratcher. If you look at a node diagram you can see the two die are connected in a package the same way a 4S system would be connected electrically, and nothing is gained or lost. The claimed ’12 DIMMs’ is the exact same thing as you would get from 2x 8280s but you do lose lots of PCIe lanes. Technically the 9200 line supports 80 lanes, down from the 96 you should get with a 2S Cascade-SP system or the 128 per 1S Rome, but the S9200WK only supports 2x PCIe3 16x slots per socket in a 2U form factor; halve that if you are in a 1U sled. Did we mention Rome is PCIe4? Guess we didn’t.

So by going with the 9200 you effectively lose 8 lanes per die but in practice the form factor, remember Intel is the only one making servers, means you have two slots per sled. Unless you want to go with the 2U form factor which blows the main premise of the 9200 line, density, out the window. In short if you pull the diagram above into GIMP and do a flood fill on the dark blue box outside the white Die boxes, then replace the 12 DIMMs bit with 2x 6 DIMMs, you have a 4S 8180 system.
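The lane arithmetic above is easy to check. Here is a minimal sketch in Python that uses only the lane and slot counts quoted in the text. The variable names are ours, and the even split of lanes between the two dies is an assumption made for illustration.

```python
# Lane counts quoted in the text (PCIe3 for Intel, PCIe4 for Rome).
LANES_9200_PACKAGE = 80    # one dual-die Cascade Lake-AP package
LANES_2S_CASCADE_SP = 96   # two single-die Cascade Lake-SP sockets
LANES_1S_ROME = 128        # one Rome socket

DIES = 2  # both Intel options above contain two dies

# Assumption: lanes are split evenly across the dies.
per_die_ap = LANES_9200_PACKAGE // DIES
per_die_sp = LANES_2S_CASCADE_SP // DIES
lost_per_die = per_die_sp - per_die_ap

# What the sled exposes per package: two x16 slots in 2U, half that in 1U.
usable_2u = 2 * 16
usable_1u = usable_2u // 2

print(f"lanes per die: AP {per_die_ap}, SP {per_die_sp}, lost {lost_per_die}")
print(f"usable lanes per package: 2U {usable_2u}, 1U {usable_1u}")
print(f"one Rome socket: {LANES_1S_ROME} lanes")
```

The gap between the 80 lanes on the spec sheet and the slots you can actually populate is the part that hurts in a 1U sled.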

Actually the 9200 line doesn’t support large memory configs like -M and -L SKUs, Speed Select Technology, or Xpoint DIMMs, but that is beside the point. Did we mention how the core count is really close to Rome? I just want to be there when an Intel sales person tries to pitch the 9200 to a customer and see how they explain that all the new killer features of the x2xx line aren’t present but that really isn’t a problem because… Someone please record the pitch and send it to me.

Intel 9200 bottom with pins

Easy to wire up

Intel 9200 PCB side shot

Cheap to make too

Then there is power to consider, and to be fair Intel did an amazing job putting 800W into a 1U half-width sled. Water cooling is problematic, costly, and usually shunned by data center managers averse to wetness in high energy density racks. That said Intel can probably convince them that it won’t be a problem and offer a reliable solution that has amazing density. If you look at air-cooling the numbers aren’t so good, you are effectively at a 4S 1U system, which is not exactly uncommon. So win #1 is that the 9200 line can be more dense than off the shelf air cooled systems if you are willing to sacrifice PCIe slots. Then again…..

The second win is that logically the two die in the 9200 line map to the system as one 32-56C device. Why is this important? Some software is licensed on a per-socket basis so with a 9200 system you can get 2x the cores for the price of one license. We won’t point out that vanishingly few packages are still licensed this way, most are per core or per AEU (Arbitrary Extortion Unit), but that number is not zero. If you pay for one of these packages the 9200 line will be a pretty amazing deal until the vendor realizes what is going on and changes the terms to reflect this loophole. Or Rome comes out. Either way there is a three month or so window where the 9200 has a second killer app.

So those are the high points, are there any low points? Sure. Let’s start out with cost. Intel flat out refuses to give SemiAccurate a price for the 9200 line because, well, the reasons vary. ‘It won’t be sold as a chip’, ‘closer to launch’, and many others have been heard, but SemiAccurate thinks it is just embarrassment at the magnitude of the number. If you think about it an 8280 costs $10,009 and the 9282 is 2x of those in a custom low volume package, socket, and system. If you price it at <2x 8280 you run the risk of people buying the 9282 instead. 2x 20-core 6248s will only run $6144 and getting to 48 cores via the 8268 will cost a mere $12,604. How much do you value density again? Any questions about the lack of pricing?
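To put those list prices on a common footing, here is a rough per-core comparison sketched in Python. The prices and core counts are the ones quoted above. The 9282 entry is purely an assumption (exactly 2x the 8280), since Intel has not given a price, and the helper name is ours.

```python
# (label, total list price in USD, total cores), figures from the text.
configs = [
    ("1x Xeon 8280", 10_009, 28),
    ("2x Xeon 6248", 6_144, 2 * 20),
    ("Xeon 8268s, 48 cores", 12_604, 48),
    # Assumption: 9282 priced at exactly 2x the 8280. No real price exists.
    ("1x Xeon 9282 (assumed)", 2 * 10_009, 56),
]

def price_per_core(price, cores):
    """Return list price divided by core count, in USD per core."""
    return price / cores

for label, price, cores in configs:
    per_core = price_per_core(price, cores)
    print(f"{label:<24} ${price:>6,} / {cores} cores = ${per_core:,.0f} per core")
```

Under that assumption the 9282 costs the same per core as the 8280, so anything you pay on top of it is the price of density.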

Intel 9200 Cascade Lake vs AMD Rome demo

One very nicely done demo

The saddest part about the 9200 line is that Intel did one of the most brilliant competitive demos SemiAccurate has seen in years, then ruined it. They competitively benchmarked a 7nm Rome against a 350W, 48C 9200, probably a 9242 but it wasn’t specifically called out. Since Intel doesn’t have a Rome, *COUGH*, how could they do this? They took the video of the Rome demo from CES that AMD put up on YouTube and ran their benchmark with the same program against the video. And guess what? The 48-core Intel 9200 won by a bit under 10%.

If you assume the 56-core 9282 is about 17% faster than the 48-core part, that would give the full version roughly a 26% advantage over Rome assuming perfect scaling. (Note: The math is (56/48)*10.65/9.86) Let’s just call it 25% faster and assume that the AMD CPU didn’t get any faster between the demo and release six months later. (Note 2: If you assume a little less than 10% gain for AMD you will be in the right ballpark but don’t tell anyone we said that) Even if AMD gets faster, Intel will still win with the 9200 line on this workload. It was a brilliant hack of a demo, well done whoever came up with that one, really, we were impressed with your ingenuity.
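For anyone who wants to check the note, the arithmetic can be spelled out in a few lines of Python. The two demo figures are the ones quoted in the note. The 10% AMD gain is our stand-in for "a little less than 10%", and perfect scaling with core count is the text's own assumption, not a measured result.

```python
# Figures from the note: (56/48) * 10.65 / 9.86
CORES_DEMO = 48    # the part Intel actually ran
CORES_9282 = 56    # the top SKU
demo_ratio = 10.65 / 9.86               # 48-core lead over Rome in the demo
core_scaling = CORES_9282 / CORES_DEMO  # assumes perfect scaling with cores
projected = core_scaling * demo_ratio   # projected 9282 lead over Rome

# Assumption: upper bound of "a little less than 10%" gain for Rome at launch.
AMD_GAIN = 1.10

def pct(ratio):
    """Convert a ratio such as 1.26 into a percentage lead such as 26.0."""
    return (ratio - 1) * 100

print(f"48C lead in demo:      {pct(demo_ratio):.1f}%")
print(f"56C vs 48C scaling:    {pct(core_scaling):.1f}%")
print(f"projected 9282 lead:   {pct(projected):.1f}%")
print(f"after AMD gains 10%:   {pct(projected / AMD_GAIN):.1f}%")
```

Even with the hypothetical AMD gain applied, the projected ratio stays above 1, which matches the claim that Intel still wins on this workload.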

So why did we say Intel ruined it? Because we asked about price. No comment. The current high end Epyc 7601 costs $4200, the current high end Cascade-SP is $10,009. Since pricing for Rome and the 9200 hasn’t been disclosed, let’s make a few assumptions. Let’s start off with the 9282 and assume it will be ~2x the 8280 or about $20K. AMD wants to gain marketshare and will likely not price Rome at 2x the Naples based 7601, let’s guess at a 50% price hike there or about $6500. Even if AMD doubles their MSRP it will still be within the off the cuff statements of some friends at AMD who claim you will be able to get 2x Romes for less than 1x 9200.
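Those guesses are simple enough to run as a sanity check. The sketch below, in Python, uses only the two disclosed list prices from the text. Every Rome and 9282 number it prints is a guess, not a disclosed price, and the scenario labels are ours. Note that a straight 50% hike on $4200 works out to $6300, a touch under the rough figure above.

```python
EPYC_7601 = 4_200    # current top Naples part, from the text
XEON_8280 = 10_009   # current top Cascade-SP part, from the text

# Assumption from the text: the 9282 lands at roughly 2x the 8280.
assumed_9282 = 2 * XEON_8280

# Hypothetical Rome price scenarios, as multiples of the 7601 price.
rome_scenarios = {"50% hike": 1.5, "doubled MSRP": 2.0}

for label, multiplier in rome_scenarios.items():
    rome_price = EPYC_7601 * multiplier
    two_romes = 2 * rome_price
    verdict = "less" if two_romes < assumed_9282 else "more"
    print(f"{label:<13} Rome ${rome_price:,.0f}, 2x Rome ${two_romes:,.0f}, "
          f"{verdict} than one assumed 9282 at ${assumed_9282:,}")
```

Both scenarios leave two Romes cheaper than one 9282 at the assumed price, which is the point the AMD friends were making.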

Back to the ruining part, the first prong is that Intel can win with a specialized part in a specialized socket that comes with handcuffs against a device that costs far less than half as much but is ‘saddled’ with a choice of commodity form factors. Intel still wins on memory slots per U but is that a market worth chasing? Instead of putting out pricing and saying, “Sure we cost more but we are worth it to several lucrative segments”, they instead put up a flashing red light and a loud siren while shouting, “we can’t compete against Rome in 99.97832% of the market”. It was honestly one of the coolest hacks of a demo we have seen in years and instead of victory it just exacerbated Intel’s pain points.

Then there is power. Rome will have a TDP of ~240W, up from the 180W of the Epyc 7601. This seems to be a fair trade for a claimed 2-4x performance increase. The 9282 is a 400W part that clocks 100/200MHz base/turbo slower than the 205W 8280. If you take the power budget for 2x 8280s… So a 9282 is a wash for TDP against Intel parts but ~25% faster than Rome at ~160W more, or 1.67x the energy use. If you are in a power constrained environment or rack, the choice is obvious. If you care about TCO or just running costs, the choice is also obvious even before you consider MSRP. While the TDP disparity can be explained away by itself, in light of the price disclosure shenanigans it was just a cherry on top of a painful pie. Did we mention we thought the demo was really cool? SIGH.
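The power comparison reduces to a few divisions, sketched here in Python. The TDPs and the ~25% speedup are the figures from the text. Treating TDP as a proxy for actual power draw is a simplifying assumption on our part, and the names are ours.

```python
TDP_9282 = 400   # watts, from the text
TDP_8280 = 205   # watts, from the text
TDP_ROME = 240   # watts, approximate figure from the text
SPEEDUP_VS_ROME = 1.25  # rounded 9282 lead on the demo workload

# Against Intel's own parts: one 9282 vs the budget for two 8280s.
two_8280_budget = 2 * TDP_8280

# Against Rome. Assumption: TDP stands in for real power draw.
extra_watts = TDP_9282 - TDP_ROME
power_ratio = TDP_9282 / TDP_ROME
perf_per_watt_ratio = SPEEDUP_VS_ROME / power_ratio

print(f"2x 8280 budget: {two_8280_budget}W vs 9282 at {TDP_9282}W")
print(f"9282 vs Rome: +{extra_watts}W, {power_ratio:.2f}x the power")
print(f"9282 perf per watt relative to Rome: {perf_per_watt_ratio:.2f}x")
```

On these numbers the 9282 delivers about three quarters of Rome's performance per watt on the one workload where it wins.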

So in the end what do we have? A desperate PR attempt so Intel can claim they are not woefully behind on core counts or performance. They engineered a bespoke package, socket, and system that no one wanted at a high cost to claim >28 cores but still lose against Rome at 64C. High density and software licenses are a clear win for a very small market, but they are a win as long as you don’t need PCIe lanes or any of the new killer features Cascade Lake brings to the table. And don’t compare it to custom water cooled Rome systems.

Instead of a win, Cascade Lake-AP is just a reminder that Intel is behind at everything this generation, or will be in a little over a month. SemiAccurate doesn’t understand why anyone would stand up and shout reminders of this fact to anyone listening, we would do the opposite. The Xeon 9200 line is a bad idea for all the wrong reasons, it offers a little good with a whole lot of down sides. It never should have been brought to market. S|A


Charlie Demerjian

Roving engine of chaos and snide remarks at SemiAccurate
Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate.com. SemiAccurate.com is a technology news site addressing hardware design, software selection, customization, securing and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also available through Guidepoint and Mosaic.