AMD’s Naples put them back in the server game

On top of Intel’s best at the moment

AMD is trickling out Naples information now, and from the look of things they are back in the datacenter too. SemiAccurate thinks the benchmarks shown were a best case for AMD, but that doesn’t take away from the fact that Naples is quite competitive.

When we first spilled the details of Naples last June, one thing was still up in the air: does the 4-die MCM use a southbridge or is it a stand-alone SoC? The answer is that it is a stand-alone SoC, and all the goodies come from the MCM package, unlike the desktop variant. This fully qualifies as ‘neato’ in our book, but it does have one major caveat, as the picture below points out.

AMD Naples 2S system diagram

128 lanes either way

As you can see, each socket is connected to the next via 64 PCIe3 lanes. The Infinity Fabric tag is important too; that is the protocol AMD chips now speak, basically Hypertransport+. What is the caveat? If you recall, a Naples socket has 128 PCIe3 lanes, of which 64 are lost to the socket-to-socket link in a 2S configuration, so a 2S system ends up with the same 128 lanes as a 1S system. Even so, Naples has far more I/O than an equivalent Intel system. If you go back to our Naples reveal, there was a tag of ‘up to 32 SATA’ lanes and the same ‘up to’ for the 16 10GbE lanes.

This is likely because those PCIe lanes are multi-purpose and can be re-purposed for SATA or 10GbE usage. This is a good thing in that you don’t need to buy external controllers for a server chassis or fill slots with NICs or controller cards. The downside is that you lose PCIe lanes either way; it is extremely unlikely you can run all 128 PCIe3 lanes as slots and still have more than a boot drive without impinging on those lanes.
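To make the lane trade concrete, here is a minimal sketch of the budget in Python. The lane counts and the ‘up to’ port limits come from the text above; the one-lane-per-port cost for SATA and 10GbE is our assumption for illustration, not something AMD has confirmed, and the function names are made up for this example.

```python
# Figures from the article: 128 PCIe3 lanes per socket, 64 of which
# become the socket-to-socket link in a 2S system.
LANES_PER_SOCKET = 128
LINK_LANES_PER_SOCKET = 64

# "Up to" figures quoted from the earlier Naples reveal.
MAX_SATA_PORTS = 32
MAX_10GBE_PORTS = 16


def system_lanes(sockets: int) -> int:
    """Lanes available for I/O in a 1S or 2S system."""
    if sockets == 1:
        return LANES_PER_SOCKET
    if sockets == 2:
        return 2 * (LANES_PER_SOCKET - LINK_LANES_PER_SOCKET)
    raise ValueError("only 1S and 2S configurations are described")


def pcie_lanes_left(sockets: int, sata_ports: int = 0, tengbe_ports: int = 0) -> int:
    """PCIe lanes left for slots and NVMe after carving out SATA and 10GbE.

    ASSUMPTION (not confirmed by AMD): each SATA or 10GbE port
    consumes exactly one lane from the shared pool.
    """
    if not 0 <= sata_ports <= MAX_SATA_PORTS:
        raise ValueError("SATA port count outside the quoted 'up to 32'")
    if not 0 <= tengbe_ports <= MAX_10GBE_PORTS:
        raise ValueError("10GbE port count outside the quoted 'up to 16'")
    return system_lanes(sockets) - sata_ports - tengbe_ports


if __name__ == "__main__":
    print(system_lanes(1), system_lanes(2))  # 128 lanes either way
    # Hypothetical 2S box with 8 SATA ports and 2 10GbE ports.
    print(pcie_lanes_left(2, sata_ports=8, tengbe_ports=2))
```

Under that assumption, even a modest storage and networking loadout eats into the 128 lanes, which is the trade described above.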

That said, this is almost a non-issue for buyers. SemiAccurate feels that most Naples buyers will be using NVMe drives unless they are doing cold storage work where vast I/O bandwidth is not important. In short, we feel that even with this unmentioned trade-off, AMD’s claim of balanced I/O is a very fair one and makes Intel’s parsimony on the same front a glaring weakness in some lucrative markets. For anything needing GPU compute, AMD’s Naples is the clear winner.

Naples vs Broadwell system specs

System to system configs

Compared to Intel’s latest Broadwell-EP (Skylake-EP is technically not out yet, even if Intel is already destroying customer relationships with it), AMD’s Naples is ahead on most feature counts. AMD put out one Naples benchmark at their Ryzen preview, a seismic data test against the Intel 2S E5-2699A system shown above. There were three tests. The first had 44 cores utilized on each system and memory locked at 1866MHz on both, the highest frequency Broadwell-EP supports. The second had the full 64 cores running on AMD with memory at the 2400MHz Naples supports. The third used a 4 billion sample grid instead of the 1 billion used in the first two tests.

The results were that AMD trounced Intel: 18 seconds for Naples versus 35 for Broadwell-EP in the first test, and 14 versus 35 in the second. The third test, while valid, we will ignore for comparison purposes because the Intel system couldn’t load the dataset and crashed. Naples completed it in 54 seconds, near linear scaling for four times the data.
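The arithmetic behind those numbers is easy to check. The sketch below uses only the run times quoted above. It assumes the third test ran in the same 64-core, 2400MHz configuration as the second, which AMD’s presentation implies but the figures here do not state outright; the helper names are ours.

```python
# Run times in seconds as reported above.
NAPLES_S = {"test1": 18, "test2": 14, "test3": 54}
BROADWELL_S = {"test1": 35, "test2": 35}  # test3 did not complete on Intel


def speedup(slow_s: float, fast_s: float) -> float:
    """How many times faster the quicker system finished."""
    return slow_s / fast_s


def scaling_efficiency(base_s: float, base_size: float,
                       big_s: float, big_size: float) -> float:
    """1.0 means run time grew exactly in step with problem size."""
    return (big_size / base_size) / (big_s / base_s)


if __name__ == "__main__":
    for test in ("test1", "test2"):
        ratio = speedup(BROADWELL_S[test], NAPLES_S[test])
        print(f"{test}: Naples is {ratio:.2f}x faster")

    # ASSUMPTION: test3 shares test2's configuration, so test2 is the baseline.
    # Grid sizes are in billions of samples.
    eff = scaling_efficiency(NAPLES_S["test2"], 1, NAPLES_S["test3"], 4)
    print(f"4x the data took {NAPLES_S['test3'] / NAPLES_S['test2']:.2f}x "
          f"as long (efficiency {eff:.2f})")
```

Four times the samples in a little under four times the run time is what ‘near linear’ means here.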

What does this tell us? It says that the test is memory bound and that AMD has twice as many memory channels per socket as Broadwell-EP. It also says AMD’s socket-to-socket connection is up to the task at hand, at least for this type of test. When cores and memory speed scale up, the results scale almost exactly with memory bandwidth. For those pointing to the delayed-again Skylake-EP/Purley having more channels, six vs four on Broadwell-EP, AMD’s advantage in HPC-type applications will still be there. If Intel takes any comfort from Skylake-EP, it will be that the new silicon halves AMD’s HPC advantage in this lucrative market.
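The memory bandwidth claim can be sanity-checked against the channel counts and memory speeds quoted above. This is a back-of-the-envelope sketch of theoretical peaks, not measured bandwidth: it treats the quoted ‘MHz’ figures as DDR4 transfer rates in MT/s and assumes standard 64-bit channels, so eight bytes per transfer. The Skylake-EP line compares channel counts only, at equal memory speed, since Skylake-EP’s memory speed is not given here.

```python
def peak_gb_per_s(channels: int, rate_mts: int) -> float:
    """Theoretical peak per socket, assuming 64-bit (8-byte) DDR4 channels."""
    return channels * rate_mts * 8 / 1000


def bandwidth_ratio(ch_a: int, rate_a: int, ch_b: int, rate_b: int) -> float:
    """Peak bandwidth of system A relative to system B."""
    return (ch_a * rate_a) / (ch_b * rate_b)


if __name__ == "__main__":
    # Channel counts per socket: Naples has twice Broadwell-EP's four.
    NAPLES_CH, BROADWELL_CH, SKYLAKE_CH = 8, 4, 6

    # Test 1: both systems at 1866. Measured speedup was 35/18, about 1.94x.
    print(f"test1 bandwidth ratio: "
          f"{bandwidth_ratio(NAPLES_CH, 1866, BROADWELL_CH, 1866):.2f}")

    # Test 2: Naples at 2400, Broadwell-EP at 1866. Measured speedup was 2.50x.
    print(f"test2 bandwidth ratio: "
          f"{bandwidth_ratio(NAPLES_CH, 2400, BROADWELL_CH, 1866):.2f}")

    # Channel-count ratio against Skylake-EP at equal memory speed (assumption).
    print(f"vs six channels: {NAPLES_CH / SKYLAKE_CH:.2f}")
```

The theoretical ratios come out at 2.00 and 2.57 against measured speedups of roughly 1.94 and 2.50, which is why the test looks memory bound.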

That leaves us with two key questions about Naples: core performance and power use. If you look at Thomas Ryan’s Ryzen review, some of those questions can be answered. At higher clocks AMD’s Zen core is very close to Broadwell and a bit weaker than Skylake. This gap is very likely to be obliterated when you consider the differential in core counts, 32 for Naples vs 22 for Broadwell-EP and 28 for Skylake-EP. On a per-socket performance basis, AMD is right where it needs to be, on top of Intel or a bit better.

The last key metric is power use, and again AMD is in a good spot. Ryzen has an official TDP of 95W vs Broadwell-E’s 145W. AMD typically overshoots their TDP a little and Intel undershoots, so the real gap isn’t 50%, but there is still a gap. Multiply this by the core counts and the energy use gap should diminish again; let’s ballpark it at about even on a per-socket basis. While we don’t know the frequency/voltage scaling curves for Naples yet, it looks good so far on paper.

So where does that leave us with Naples? On a critical HPC workload, albeit a memory bound one, AMD trounces Intel’s best and should trounce Intel’s upcoming best. On a socket-to-socket performance basis, AMD looks to beat Broadwell-EP and roughly tie Skylake-EP, but this will probably swing wildly with differing workloads. On the energy use front things are a little murkier, but AMD looks to be very close to Intel with Naples; time will tell. The last metric is price, where AMD can both substantially undercut Intel and make healthy margins in the process. Things are looking very good for AMD at the moment in servers. S|A

Charlie Demerjian

Roving engine of chaos and snide remarks at SemiAccurate
Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate.com. SemiAccurate.com is a technology news site addressing hardware design, software selection, customization, securing and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also available through Guidepoint and Mosaic.