Intel launches Knights Landing Phi and other goodies

Supercomputing made into Legos, sort of

Jun 20, 2016 by Charlie Demerjian

Intel is talking about three things at this ISC: Phi, SSF, and machine learning. In short, it is everything SemiAccurate expected them to talk about at the show, and it all ties together.

Let's start out with the good stuff, the Knights Landing CPU aka Xeon Phi 72xx series CPUs. We told you about the architecture in depth last year and the chips are finally out as a product. Take a Silvermont Atom, knead thoroughly, and add a sprinkle, that would be two, of 512b vector units. Multiply by 72 cores and bake in a 14nm fab until a golden-brown crust forms. Salt to taste. Optionally you can add whipped cream or an on-package Omni-Path adapter. SemiAccurate feels the whipped cream would look MUCH better than the ungainly Omni-Path adapter but Omni-Path definitely has higher bandwidth, see?

Does this remind you of anything clammy?

We will skip all the superlatives and performance claims and get right down to the heart of the value proposition, if a $2438-6254 CPU can be labeled ‘value’. The first Phis out on the market are going to be socketed parts rather than PCIe cards. They will boot a full standard OS; in this case RHEL and SUSE Linux are blessed, but others should work like they do on a normal x86 system. For the masochistic, Windows will run but it isn’t officially supported initially; expect it to get the blessing in a few months. The card variant will have an embedded Linux but that is more of a housekeeping OS; the host system will run the user-facing OS.

All four SKUs for the new Phi

As you can see from the specs, the two middle parts, the sweet spots for perf/watt and bandwidth, sit at the low end of the large GPU energy draw spectrum and are comparable on price. Add in 15W for Omni-Path and you are still below the 300W usually sucked up by GPUs. Since these parts are a big vector engine with a little CPU bolted on, there won’t be much dark silicon here so Intel’s 14nm process is definitely a cut above the competition. For those into statistics, the Knights Landing die is 658mm^2 (20.853mm x 31.558mm) and has a bit more than 8B transistors.
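For those who are really into statistics, the quoted dimensions do multiply out to the quoted area. The sketch below checks that and derives a rough average transistor density. The 8e9 figure is "a bit more than 8B" rounded down, an assumption on our part, so treat the density as a floor rather than an official number.

```python
# Die dimensions quoted in the article, in millimetres.
DIE_WIDTH_MM = 20.853
DIE_HEIGHT_MM = 31.558

# "A bit more than 8B" transistors; 8e9 is a rounded-down assumption.
TRANSISTORS = 8e9


def die_area_mm2(width_mm: float, height_mm: float) -> float:
    """Area of a rectangular die in square millimetres."""
    return width_mm * height_mm


def density_per_mm2(transistors: float, area_mm2: float) -> float:
    """Average transistor density in transistors per square millimetre."""
    return transistors / area_mm2


area = die_area_mm2(DIE_WIDTH_MM, DIE_HEIGHT_MM)
density = density_per_mm2(TRANSISTORS, area)

print(f"area: {area:.1f} mm^2")
print(f"density: {density / 1e6:.1f}M transistors/mm^2")
```

The area comes out at about 658mm^2, matching the quoted figure, and the density lands at a little over 12M transistors per mm^2 averaged across the whole die.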

Better yet, a Phi will run bog-standard x86 code with no latency hit for PCIe traversal and no odd memory caps or mappings. 384GB of DDR4 plus 16GB of fast MCDRAM is a tad more than you can get on a modern GPU, by an order of magnitude or so, and if you read the above link, the allocation is pretty flexible too. Couple this with modern GPU competitive performance of 6.09/3.05TF SP/DP and you have a pretty nice chip. The tools Intel provides for parallel programming, optimization, and the rest are top-notch too. In short, Intel is offering a GPU performance part with an x86 learning curve, i.e. little to none.
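The peak numbers fall straight out of the recipe: cores, times two vector units, times lanes per 512b unit, times two for a fused multiply-add, times clock. Here is a rough sketch of that arithmetic. The 68 cores and 1.4GHz are our assumption for the SKU those figures belong to, not numbers stated in the text above, and one FMA per vector unit per cycle is likewise assumed.

```python
VECTOR_BITS = 512     # width of each vector unit, from the article
VPUS_PER_CORE = 2     # two vector units per core, from the article
FLOPS_PER_FMA = 2     # a fused multiply-add counts as two operations (assumed 1 FMA/cycle/unit)

# Assumed SKU parameters, not taken from the article text.
ASSUMED_CORES = 68
ASSUMED_CLOCK_GHZ = 1.4


def peak_tflops(cores: int, clock_ghz: float, element_bits: int) -> float:
    """Theoretical peak in teraflops for a given floating point element width."""
    lanes = VECTOR_BITS // element_bits
    flops_per_cycle = cores * VPUS_PER_CORE * lanes * FLOPS_PER_FMA
    # flops/cycle times GHz gives gigaflops; divide by 1000 for teraflops.
    return flops_per_cycle * clock_ghz / 1000.0


dp = peak_tflops(ASSUMED_CORES, ASSUMED_CLOCK_GHZ, element_bits=64)
sp = peak_tflops(ASSUMED_CORES, ASSUMED_CLOCK_GHZ, element_bits=32)

print(f"DP peak: {dp:.2f} TF")
print(f"SP peak: {sp:.2f} TF")
```

Under those assumptions the sketch gives roughly 3.05 teraflops double precision and 6.09 single precision, which is where the 2:1 SP/DP ratio comes from: halve the element width and you double the lanes. These are theoretical peaks, and real code will land below them.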

For those wanting in on it, you can either buy a 1S workstation/vertical server from Intel or wait for system builders to start offering their wares. Intel is effectively sold out of the 7290 for the first few months; they should be back in stock by September. By then you should see a flood of other Phi/Knights Landing based systems on the market from a host of other vendors. This is partly because of part two of the story, SSF.

SSF stands for Scalable System Framework, an all-encompassing hardware, design, and certification system for Intel products in the HPC/supercomputing market. Designing things well on this end of the market is not as easy as plugging in Ethernet ports, and certification for various software packages is another painful slog. Intel is trying to take the pain out of the whole process by making and certifying all the building blocks possible beforehand. With luck a VAR will only have to pick the right blocks, build the system, and spend the time tuning and optimizing rather than reinventing the wheel and doing the paperwork to prove it. If this sounds a lot like the old Cluster Ready program, that is because SSF has now subsumed and replaced that program and expanded its reach.

Last up is an interesting one, Artificial Intelligence. Intel has been rather quiet about this area while the GPU and ASIC guys get all the credit. That changes today with Intel shouting quite loudly about their AI program, both hardware and software. The first point they make is pretty interesting: most AI tends to run on one box because scaling it is really hard. This is why you see 8-GPU systems with as much memory as possible, a painfully expensive configuration vs 8x 1-GPU 1-CPU or 4x 2-GPU 1-CPU systems.

One of these things scales well

In case you missed the bit about Phi being bootable, having Omni-Path on package, and having 10x+ the memory of a GPU, these things matter a lot for AI. Intel’s chips provide better-than-GPU performance per watt, don’t need an expensive CPU to host the code, and have more local memory space too. The most informative part of the scaling data on AI workloads is the transition from 32 nodes to 128: Intel’s offerings can make it, and GPUs possibly could, but like Intel, I couldn’t find any public data on it. Now do you see why Nvidia is pushing NVLink?

Intel has all the relevant AI libraries optimized for their CPUs, like they do with most software, so the optimization of your cluster should be less about plumbing and more about useful things. Add portability between Phi and normal Xeons and you have a lot of headaches simply avoided. With Nvidia’s Pascal GP100 device delayed until 2017, Phi is looking pretty good, especially for greenfield builds.

If you haven’t got the big picture yet, let me fill in what Intel is trying to do in the HPC/supercomputer arena. Instead of selling CPUs or even servers, they are trying to push out complete solutions, be it from a VAR or as self-made clusters. To do this they are putting the pieces together into building blocks, offering tools, libraries, and optimized software, and pre-certifying the results as much as possible. This should seriously lower the cost of entry into the space and hopefully give the Intel based solutions a much better TCO than the competition. With the new offerings today, they look to have nearly all the pieces in place from silicon to VARs. S|A


