Sandforce SSDs break TPC-C records

A look at the newest SSD controllers

SSD CONTROLLERS USUALLY have one winner per generation, and it looks like this time Sandforce is the dominant player. The company released two new controllers today, and set a few records with them for good measure.

The latest Sandforce chips, the SF-1200 and SF-1500, are really ‘second generation’ controllers, with features that go far beyond simple data transfer. The higher-end SF-1500 has a lot of enterprise features, while the SF-1200 offers a subset of those aimed more at consumers. Both, however, have one major trick: they use MLC (Multi-Level Cell) flash to run at SLC (Single-Level Cell) drive speeds and reliability levels.

In the old ‘first generation’ controller days, SLC was the flash to choose if you wanted speed and reliability. SLC was typically rated at 100,000 writes per cell, and had notably higher write speeds than MLC. MLC stores two bits per cell, so it has a massive density advantage over SLC. If you wanted capacity, you went MLC and gave up a little speed. A bit more troubling was the rated life, typically in the 10,000 write range, or 10 percent of a typical SLC cell. This kept MLC out of the enterprise space.

So, in comes Sandforce, promising speed, reliability and cost savings by using MLC where SLC previously reigned. How did they do it? Magic, of course, especially if you consider proprietary algorithms that limit write amplification to be magic.

Block diagram of a SF-1×00

Magic would not be magic without a product name, and in this case Sandforce calls it DuraClass. DuraClass consists of four components: DuraWrite, RAISE, performance and low power consumption.

DuraWrite is the bit that allows MLC flash to have enough write lifetime to stand in for SLC flash. The idea is simple enough: reduce a phenomenon called write amplification by about 20x by actively looking at the data and dealing with it intelligently.

Flash must be written to in blocks and, depending on the architecture of the chip, it’s usually written about 4KB at a time. If you write 1 byte, the drive needs to write another 4095 bytes along with it to fill the block. If you modify a single bit in a file, the drive needs to read the whole block, modify it, and write it back to another location for wear leveling. Sandforce claims that the typical drive has 10-times write amplification, meaning that for every 1K the host writes, the drive ends up writing about 10K to flash.
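The arithmetic in that paragraph can be sketched in a few lines. This is only an illustration of the rounding effect, assuming the 4KB write unit mentioned above; the function names are made up for this example and say nothing about how any real controller is built.

```python
import math

def flash_bytes_written(host_bytes: int, unit_bytes: int = 4096) -> int:
    """Bytes the flash must program when every write is rounded up
    to a whole number of write units (assumed to be 4KB here)."""
    units = math.ceil(host_bytes / unit_bytes)
    return units * unit_bytes

def write_amplification(host_bytes: int, unit_bytes: int = 4096) -> float:
    """Ratio of bytes written to flash over bytes the host asked to write."""
    return flash_bytes_written(host_bytes, unit_bytes) / host_bytes

print(flash_bytes_written(1))      # 4096: the 1 byte plus 4095 bytes of filler
print(write_amplification(1))      # 4096.0
print(write_amplification(4096))   # 1.0: a full, aligned write has no overhead
```

Real drives add further overhead from wear leveling and garbage collection, which this sketch does not model.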

On the SF-1500, Sandforce uses a combination of data deduplication and compression to lower the write amplification level. Data deduplication is just what it sounds like. If a block is the same as another, it is not written; a pointer to the existing block is put in its place. This can save a lot of writes, as does compression.
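Sandforce’s algorithms are proprietary, so the following is only a toy model of the general idea, not the company’s method. It assumes SHA-256 hashes to spot duplicate blocks and zlib for compression; the class `DedupCompressStore` and its fields are hypothetical names invented for this sketch.

```python
import hashlib
import zlib

class DedupCompressStore:
    """Toy block store: identical blocks are stored once, and stored
    blocks are compressed. Not how any real controller is implemented."""

    def __init__(self):
        self.blocks = {}      # digest -> compressed payload (stands in for flash)
        self.table = {}       # logical block address -> digest (the "pointer")
        self.host_bytes = 0   # bytes the host asked to write
        self.flash_bytes = 0  # bytes actually stored

    def write(self, lba: int, data: bytes) -> None:
        self.host_bytes += len(data)
        digest = hashlib.sha256(data).digest()
        if digest not in self.blocks:
            # New content: compress it and store it once.
            payload = zlib.compress(data)
            self.blocks[digest] = payload
            self.flash_bytes += len(payload)
        # Duplicate or not, the logical address just points at the content.
        self.table[lba] = digest

    def read(self, lba: int) -> bytes:
        return zlib.decompress(self.blocks[self.table[lba]])

    def amplification(self) -> float:
        return self.flash_bytes / self.host_bytes

store = DedupCompressStore()
block = b"A" * 4096
store.write(0, block)
store.write(1, block)           # duplicate: only a pointer is recorded
assert store.read(1) == block
print(store.amplification())    # well below 1.0 for this highly compressible data
```

The sketch ignores everything that makes the real problem hard, such as freeing blocks that are no longer referenced and data that does not compress at all.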

Sandforce claims they have 0.5-times write amplification on a typical workload, 20-times less than competitors. With 20-times fewer writes, an MLC cell rated at 10K writes will theoretically live twice as long as an SLC cell rated at 100K writes behind an ‘old school’ controller. How well this works in practice is one of those questions that can only be answered by long term testing, but IBM just set the world record TPC-C benchmark for 8-core machines using 56 177GB MLC drives with Sandforce controllers. If you need more reliability than a Big Blue Power 7 box set up for TPC-C, good luck finding it.
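The ‘twice as long’ figure follows directly from the numbers quoted in this article. As a rough check, assuming lifetime scales as rated cycles divided by write amplification (a simplification, and the function name is just for illustration):

```python
def host_overwrites(rated_cycles: int, write_amplification: float) -> float:
    """How many times the host can rewrite a cell's worth of data before
    the cell hits its rated cycle count (simplified model)."""
    return rated_cycles / write_amplification

mlc_sandforce = host_overwrites(10_000, 0.5)   # 10K-rated cell, 0.5x amplification
slc_old_school = host_overwrites(100_000, 10)  # 100K-rated cell, 10x amplification

print(mlc_sandforce)                    # 20000.0
print(slc_old_school)                   # 10000.0
print(mlc_sandforce / slc_old_school)   # 2.0
```

Both amplification figures are Sandforce’s claims, so the result is only as good as those inputs.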

Somewhat related to the write reduction is encryption: the SF-1500 can transparently encrypt all data with AES-128 on the fly. Just set a BIOS password, and off you go, free security. All claimed performance numbers from the company are with encryption on, so it looks to work at true wire speed.

Speed is good, but in storage, reliability is more important, and that is what RAISE, or Redundant Array of Independent Silicon Elements, is for. Think of it as RAID at the chip level. Sandforce implements the normal ECC (24b/512B sector) and SATA level error correction protocols, but those mainly prevent transfer errors. When a drive powers down, once the cache is cleared, it writes a bit to a known location, in the same way desktop file systems do.

If that bit is there when the drive is powered on, it goes about its merry way. If the bit isn’t there, there is likely a problem, and like desktop drives, it checks the disk for errors. RAISE comes into play if there is an error, from premature shutdown or anything else. It is capable of restoring a sector (512B), block, or logical (4K) block in the same way RAID can restore a dead drive. Think of it as single drive RAID-5.
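Sandforce has not published how RAISE works internally, so the sketch below only shows the RAID-5 style idea the comparison points to: keep an XOR parity chunk alongside the data chunks, and rebuild any single lost chunk from the survivors. The function names and chunk sizes are assumptions made for this illustration.

```python
from functools import reduce

def xor_parity(chunks):
    """XOR equal-length byte chunks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def rebuild(chunks, parity, lost):
    """Recover the chunk at index `lost` from the surviving chunks and parity."""
    survivors = [c for i, c in enumerate(chunks) if i != lost]
    return xor_parity(survivors + [parity])

# Three data chunks, as if spread across three flash dies, plus one parity chunk.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)

# Pretend the second chunk became unreadable.
damaged = [data[0], None, data[2]]
assert rebuild(damaged, parity, lost=1) == data[1]
```

Single parity like this can only repair one missing chunk per stripe, which is the same limit RAID-5 has with whole drives.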

One other technology that Sandforce has is called SuperCAP, and it is what it sounds like: a capacitor that holds enough power to write out any buffers to flash memory upon power loss. Basically, if you are in the middle of writing a block and literally pull the plug, all data that has reached the drive cache will be written to flash. From there, RAISE should ensure it comes back out.

That brings us to power savings. Sandforce claims 0.4W idle power and a max draw of 2.5W, far below both mechanical and SLC drives of similar speed. MLC flash can use less power than SLC, and since Sandforce doesn’t need complex DRAM caches to keep down write amplification, it saves a lot of power. It will be interesting to see how well those power numbers hold up against other second gen controllers when they hit the market.

So, how well do the controllers perform? The official claim is 30K IOPS, but a few vendors are saying their tuning and firmware will go quite a bit higher than that. Read speed is 270MBps and write speed is 263MBps, basically limited by SATA-II bus saturation. Some drive makers have SATA-3 controllers that prevent this bottleneck and allow for more than 300MBps reads. The one that is available now, Micron, has 350MBps reads, but the writes are much slower.
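To see why those sequential numbers count as bus saturation, here is a back-of-the-envelope check. It assumes the standard SATA-II figures of a 3Gbps line rate with 8b/10b encoding; command and framing overhead, which lowers the usable ceiling further, is not modeled.

```python
def sata_payload_mb_per_s(line_rate_gbps: float) -> float:
    """Raw payload ceiling of a SATA link in MB/s.
    8b/10b encoding means every 10 bits on the wire carry 8 bits of data."""
    data_bits_per_s = line_rate_gbps * 1e9 * 8 / 10
    return data_bits_per_s / 8 / 1e6   # bits -> bytes -> megabytes

ceiling = sata_payload_mb_per_s(3.0)
print(ceiling)   # 300.0

# Claimed sequential speeds from the article, as a share of that ceiling.
for label, mb_per_s in [("read", 270), ("write", 263)]:
    print(f"{label}: {mb_per_s / ceiling:.0%} of the SATA-II ceiling")
```

The claimed read speed sits at about nine tenths of the theoretical ceiling, which is roughly what protocol overhead leaves on the table.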

Mixed performance levels

One interesting bit is the performance across mixed workloads, where Sandforce claims to be the fastest out there. Given the even read versus write performance on sequential tests, it isn’t all that much of a stretch to believe those numbers, so the drives based on the chip should be very ‘balanced’.

The main difference between the SF-1500 and SF-1200 is that the SF-1200 is capped at about 10K IOPS, so it will be notably slower than the SF-1500. Since it is aimed at consumer and notebook applications, it doesn’t have a lot of the data level ‘smart’ features, but most notebook users will never notice the difference.

There will also be some ‘uncapped’ SF-1200 chips out there for special edition drives. OCZ was showing some off at CeBIT, and others will likely follow. With that part, Sandforce looks to hit the sweet spot of speed and lower cost.

In the end, Sandforce looks to be the first out of the gate with a true ‘next generation’ SSD controller. If the numbers in the field hold up to the promises, and early results indicate that they do, it looks like the company has made something that scales from a consumer notebook to a fire-breathing IBM Power 7 box that sets TPC-C benchmark records. Not bad for the company’s first product line. S|A