ARM is making IoT smart to save the net

Techcon 2016: Compute, data, and energy locality at once

At Techcon, ARM was talking about a sea change in computing driven by IoT. This is something SemiAccurate has been writing about for a while, but it pervaded almost all talks given by the company.

This change is related to the concept of compute locality vs data locality, the long way of asking whether you crunch the numbers where you gather them or at a central location. At the moment video is driving this trend because it is so bandwidth intensive, but cloud based video analytics are only the tip of the iceberg. Other services will do the same once there are millions or billions of devices deployed.

The problem is simple: how much does it cost to send bits to a cloud server to process, store, analyze, or all three? Bandwidth is cheap if you are a home user in a first world country; it is almost free on a per-MB level. Sensors, IoT data, and the rest get shuffled off to the cloud and stored on Facebook et al’s servers indefinitely.

That said, even these megadatacenters have limits. Facebook, YouTube, and all the others have been compressing the heck out of video for a while, as you might have noticed. Even they can’t store the firehose of streams coming at them at full resolution. Sooner rather than later economic realities are going to force them to change how they operate and what they store. Recompression is only the beginning from their point of view.

For the home user, uploads are not really an issue; I’ll bet you don’t think twice about uploading a large video to the cloud from your home broadband connection. Doing the same with 4K videos on your phone might cause you to pause a bit. How many GB does your monthly LTE plan support? How many minutes of 4K video is that? This is the start of the problem: even at HEVC compression rates, 4K video is economically non-viable over cell networks for the common cases.
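To put rough numbers on the LTE question, here is a minimal sketch of the arithmetic. The 10 GB monthly cap and the 15 Mbps HEVC 4K bitrate are assumptions picked for illustration, not figures from ARM or any carrier, so plug in your own plan and camera settings.

```python
def minutes_of_video(plan_gb: float, bitrate_mbps: float) -> float:
    """Minutes of video that fit in a data plan of plan_gb decimal gigabytes."""
    plan_megabits = plan_gb * 1000 * 8      # GB -> MB -> megabits
    seconds = plan_megabits / bitrate_mbps  # how long the stream can run
    return seconds / 60


# Assumed values for illustration only: 10 GB cap, 15 Mbps 4K HEVC stream.
ASSUMED_PLAN_GB = 10
ASSUMED_4K_HEVC_MBPS = 15

minutes = minutes_of_video(ASSUMED_PLAN_GB, ASSUMED_4K_HEVC_MBPS)
print(f"{minutes:.1f} minutes of 4K video per month")
```

Under those assumptions the whole month's allowance is gone in about an hour and a half of footage, which is why always-on 4K over cellular does not add up.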

When 4K video devices like security cameras, smart doorbells, baby monitors, and all the rest start to become commonplace in the home, even fast home connections will suffer. It doesn’t take many 4K video streams to hit the broadband transfer limits that most providers bury in the fine print. This is avoidable with a little bit of forethought, but think about the next few steps in this IoT progression.

Currently most people reading this will have a few dozen connected devices in their home; if you count everything from thermostats to PCs and TVs, the numbers add up quickly. When the full IoT storm hits in the coming years, those dozens are going to move to hundreds or more. Every light bulb, electrical socket, appliance, toy, and gizmo will be connected. No single one will be a problem, but with hundreds of them data usage will climb toward those caps fast. 4K and higher video isn’t going away either, and together they will mean pain for broadband connections.

Once again this isn’t going to be a major problem for most home users; you can just buy more bandwidth. Then again, a hefty monthly fee just to have usable bandwidth on top of what the devices spewing reports in the background consume isn’t a really enticing option for most people. As mildly unpleasant as this may be, think about it from the network provider’s perspective.

These companies tend to oversell bandwidth to home users by about 30:1 and bank on users not knowing or caring that they are getting throttled by their neighbors. This is why during peak hours, say after work/evenings, your movies may crawl and buffer. That 30Mbps connection may seem like dialup in a busy neighborhood, and overselling is why. This works for sporadic consumer users but when you get to always on, constantly streaming IoT devices, the system breaks down.
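The overselling math is easy to sketch. The model below is a deliberate simplification and an assumption on my part: it treats the shared capacity as split evenly among whoever is active at that moment, which real traffic shaping does not do exactly. The 30Mbps and 30:1 figures are the ones used above; the function name and the activity fractions are made up for illustration.

```python
def per_user_mbps(advertised_mbps: float, oversell_ratio: float,
                  active_fraction: float) -> float:
    """Throughput each active user sees under an even-split sharing model."""
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    # Capacity the provider actually provisions per subscriber.
    provisioned_per_subscriber = advertised_mbps / oversell_ratio
    # Idle subscribers leave their share to the active ones.
    fair_share = provisioned_per_subscriber / active_fraction
    # Nobody gets more than the advertised line rate.
    return min(advertised_mbps, fair_share)


for fraction in (1 / 30, 0.25, 1.0):
    rate = per_user_mbps(30, 30, fraction)
    print(f"{fraction:.0%} of subscribers active -> {rate:.1f} Mbps each")
```

With sporadic use almost everyone sees the full advertised rate. With always-on devices the active fraction heads toward 100% and each 30Mbps line degrades toward 1Mbps in this model, which is the breakdown described above.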

That is of course the easy problem, wired infrastructure. You can just add more switches to the CO, more fiber to the backhaul, and so on. When it comes to cellular and over the air services, spectrum is a pretty hard limit. Things work pretty well now, but when you start getting heavy use cases like a stadium event or, worse yet, a public disaster, the system tends to get brought to its knees or breaks down entirely. Things like LTE Broadcast help but can only do so much. Millions or billions of LTE based IoT devices will break the current system. 5G is being architected for this but there is only so much it can do; bandwidth is still finite.

So back to the original problem, data locality vs compute locality. ARM is quick to point out that there is a solution to the problem by doing some simple number crunching. Instead of streaming 4K security camera footage that is 23.8 hours a day of an empty door and .2 hours of someone standing in front of it, why not only send the .2 hours? That requires some number crunching and usually specialized analytics software or, better yet, hardware. ARM’s Apical ISP is just that: it is hardware AI/image recognition in a functional unit for low power devices.

Even without AI and facial recognition, you can easily do motion detection with low power hardware to only stream those .2 hours. This helps a lot but sooner or later that won’t be enough so the next step is to only stream the things you want to see. If it is a home security camera, you probably don’t care about yourself or your family walking in the door. You do care about people you don’t know walking in the door. With facial recognition you could cut down that data stream by a few more orders of magnitude.
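For a feel of how little work basic motion detection needs, here is a minimal frame differencing sketch. It is a generic textbook technique, not how Apical or any shipping camera does it, and the thresholds, the tiny 4x4 grayscale frames, and the function names are all assumptions for illustration.

```python
Frame = list[list[int]]  # grayscale pixels, 0-255, as rows of ints


def motion_detected(prev: Frame, curr: Frame,
                    pixel_delta: int = 25,
                    changed_fraction: float = 0.01) -> bool:
    """True if enough pixels changed by more than pixel_delta between frames."""
    total = 0
    changed = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += 1
            if abs(p - c) > pixel_delta:
                changed += 1
    return total > 0 and changed / total >= changed_fraction


def frames_with_motion(frames: list[Frame]) -> list[int]:
    """Indices of frames that differ noticeably from the frame before them."""
    return [i for i in range(1, len(frames))
            if motion_detected(frames[i - 1], frames[i])]


# Toy input: an empty doorway, then a bright blob appears for one frame.
still = [[10] * 4 for _ in range(4)]
visitor = [row[:] for row in still]
for y in (1, 2):
    for x in (1, 2):
        visitor[y][x] = 200

frames = [still, still, visitor, still, still]
print(frames_with_motion(frames))  # [2, 3]: blob arrives, blob leaves
```

Only the flagged frames, plus a little padding around them in a real product, would need to leave the device. Everything else is the empty door.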

That is the crux of the issue: instead of streaming a very high bit rate file all the time, you can cut it down to almost nothing with a little number crunching. With a lot more compute you can take that number down quite a bit more. Suddenly the transmission and storage infrastructure that is already in place can cope. More importantly, there are sane and affordable business models to provide all of this; it can be made to work on human budgets too.

All of this, however, requires a large leap in focused compute capabilities at the device node itself. Cameras will need image or facial recognition. Other sensors will need similar types of classification software or, better yet, hardware. Without it the data streams of essentially worthless information will overwhelm any infrastructure put in place. A day of 4K video will consume ~1TB of HDD space; there is no sane economic model to cope with that.
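The ~1TB per day and .2 hour figures above are enough to work out what is at stake per camera. This is a back-of-the-envelope sketch that assumes a decimal terabyte and a constant bitrate around the clock; the function names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 3600


def implied_bitrate_mbps(bytes_per_day: float) -> float:
    """Constant bitrate that fills bytes_per_day over 24 hours."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def event_only_bytes(bytes_per_day: float, event_hours: float) -> float:
    """Bytes kept if only event_hours out of each day are stored or sent."""
    return bytes_per_day * event_hours / 24


ONE_TB = 1e12  # assuming a decimal terabyte

print(f"Always-on stream: {implied_bitrate_mbps(ONE_TB):.1f} Mbps")
print(f"Only the .2 hours: {event_only_bytes(ONE_TB, 0.2) / 1e9:.2f} GB per day")
```

That works out to roughly 93 Mbps sustained for the always-on case versus a bit over 8 GB a day if only the interesting .2 hours are kept, a reduction of about 120x before any smarter filtering is applied.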

Luckily this change is underway as we speak. The analytics are being built into most near future devices. This will require a much more expensive device though, instead of a few dollars of silicon in a camera we are going to need low tens of dollars in chips per camera. This will in turn save much more than tens of dollars in bandwidth and storage costs, the ROI will be both very positive and very quick. If you can’t comprehend why it will be so when shopping for your next in-fridge 4K camera, think about how many businesses have a few dozen cameras at each location. It adds up.

To confound the issue even more, the next stage of our conundrum is data locality vs compute locality with energy as an added factor. Since many IoT devices are in energy constrained environments, be it battery, thermal limits, or just plain electricity costs, energy use is a first order problem now. Instead of just crunching the heck out of the video streams and sending only specific clips with people of interest, you have to ask if you can actually do that analysis on the end node.

Companies ahead of the curve like ARM’s Apical and Qualcomm with Zeroth are already providing solutions. Qualcomm’s Zeroth works with their ISP, DSP, GPU, and CPU to do all of this kind of analysis in the most power friendly way possible. It is available now in hardware sold by the millions, if you have a Snapdragon 820 device you probably use it without knowing it is there. ARM’s Apical hardware is still a ways out but it will be pretty widespread before long too.

This level of AI and analysis needs to come down further in cost and power, and do so fast. For IoT to not completely break the transmission and storage infrastructure of the internet, there is no other choice. Luckily for us all, that was the main theme behind Techcon. IoT is changing from sensors with radios to sensors with radios and lots and lots of focused heavy compute. While some companies address this on paper with generic CPU power, that won’t fly because the new paradigm adds the modifier of “per Watt” or, more to the point, per milliwatt. Generic compute for facial recognition is a non-starter in power constrained environments; you need bespoke hardware. Period.

Rather than the usual doom and gloom from potentially net-destroying infrastructure problems, this time SemiAccurate is going to end on a positive note. At Qualcomm’s 4G/5G summit two weeks ago, this problem seemed to be well understood from the hardware and carrier side. You can buy the hardware from them now and 5G should make things a lot more efficient when it arrives. Last week’s ARM Techcon had the problem as its core theme; almost every talk was directly related to it. ARM has applicable bespoke hardware to license now as well, so in a year or so the right pieces will be available to even the most budget conscious silicon maker. This doesn’t mean the problem is solved or there won’t be pain and expense, just that there is a potential solution coming before the bus runs off the cliff. Get working, people, there is a lot to do. S|A

Charlie Demerjian
