Sapphire Rapids 4th Gen Xeon Hands-On: Testing Intel's Bold Claims


Remote Sapphire Rapids Testing For Client/Server, And Our Key Take-Aways


Intel's Remote Environment

Intel gave us unfettered access to another Sapphire Rapids-based server, with hardware identical to what we had on hand, in a remote environment. We needed it, along with a second system built around a pair of 40-core 3rd Generation Xeon Scalable processors based on the Ice Lake architecture, which acted as the client. The two systems were joined by a pair of 100 Gigabit Ethernet connections to maximize available bandwidth, so the network would not become a bottleneck in the various network-related tasks.

The first test is SPDK, the Storage Performance Development Kit, used here to move data from NVMe drives over TCP. There are three different ways to handle error detection on those data stream transfers while keeping latency down: the out-of-the-box experience direct from SPDK, Intel's Intelligent Storage Acceleration Library (ISA-L, used earlier with QATzip), and the company's Data Streaming Accelerator (DSA), one of the accelerator engines built into Sapphire Rapids. We tested IOPS and latency across all three.
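To give a feel for the kind of error-detection work being offloaded, here is a minimal sketch that checksums a buffer block by block and then verifies it on the "receiving" side. It is an illustration under stated assumptions, not SPDK's code: Python's `zlib.crc32` (plain CRC-32) stands in for the CRC32C digest that NVMe over TCP defines, and the function names and block size are our own.

```python
import zlib


def block_digests(payload: bytes, block_size: int) -> list[int]:
    """Return one CRC-32 value per block_size chunk of payload.

    Assumption: zlib's CRC-32 is a stand-in for the CRC32C digest
    used by NVMe/TCP; the per-block structure is what matters here.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [
        zlib.crc32(payload[offset:offset + block_size])
        for offset in range(0, len(payload), block_size)
    ]


def verify(payload: bytes, block_size: int, expected: list[int]) -> bool:
    """Recompute the digests on the receiving side and compare."""
    return block_digests(payload, block_size) == expected


# Hypothetical sample data: 128 KiB split into 16 KiB blocks.
data = bytes(range(256)) * 512
sent = block_digests(data, 16 * 1024)

# Flip a single bit to simulate corruption in transit.
corrupted = bytearray(data)
corrupted[5] ^= 0x01
```

Every byte of every transfer has to pass through a loop like this, which is why moving it off the general-purpose cores (to vectorized ISA-L routines or to DSA hardware) can matter at high IOPS.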

[Chart: SPDK KIOPS, Intel Sapphire Rapids 4th Generation Xeon]

[Chart: SPDK latency, Intel Sapphire Rapids 4th Generation Xeon]

There are two tests here: one with 16kB blocks and a queue depth of 256, and another with 128kB blocks and a queue depth of 64. As is typical of storage benchmarks, smaller block sizes result in lower latencies and higher IOPS, but both tests demonstrate that DSA was able to improve throughput and reduce latency at the same time. The larger blocks of the 128kB QD64 test allowed for a larger jump in IOPS when using DSA compared to ISA-L.
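The link between queue depth, IOPS, and latency follows from Little's law: with the queue kept full, mean latency is roughly queue depth divided by IOPS. The sketch below works that out, along with the bandwidth implied by a given block size. The input figures are hypothetical placeholders chosen for easy arithmetic, not our measured results, and the helper names are our own.

```python
def mean_latency_us(queue_depth: int, iops: float) -> float:
    """Little's law: mean latency = outstanding requests / completion rate."""
    return queue_depth / iops * 1_000_000


def throughput_gbps(iops: float, block_bytes: int) -> float:
    """Bandwidth implied by an IOPS figure at a given block size."""
    return iops * block_bytes * 8 / 1e9


# Hypothetical example: 500,000 IOPS with 16 KiB blocks at queue depth 256.
example_latency = mean_latency_us(256, 500_000)        # about 512 microseconds
example_bandwidth = throughput_gbps(500_000, 16 * 1024)  # about 65.5 Gbps
```

This is also why an accelerator that raises IOPS at a fixed queue depth lowers average latency in the same run: the two numbers are two views of the same measurement.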

Next up is NGINX, which benefits from QAT in compression and cryptography acceleration. This test measures the number of secure connections a server can establish per second. The client sends a connection request without requesting a packet in response and uses key and certificate authentication, just like every secured website on the internet. Because no packets are requested, only the TLS handshake completes, over and over, as fast as the two machines can handle it. We tested with the standard out-of-the-box experience, with QAT software (supported by Intel's oneAPI, just like QATzip), and with QAT hardware acceleration.

[Chart: NGINX TLS, Intel Sapphire Rapids 4th Generation Xeon]

What we see is not an increase in the number of secure connections that can be established per second, because that value (around 65k) doesn't change from run to run. In fact, it doesn't even take all 120 cores to saturate the network. With QAT, however, the number of cores required drops precipitously, once just by using oneAPI's math libraries, and again with hardware QAT. This again leaves many more CPU cores free to do other work.

Last but not least, we have IPsec, the security protocol used by VPNs the world over. Like our NGINX test, this establishes secure connections, but this time they're fully formed connections with data exchanged between the client and server. And just like last time, even a pair of 100 Gbit Ethernet cards is the limiting factor, so we're measuring not how many connections can be established, but how many CPU cores are required to do it. No matter what, we are limited to 177 Gbps of bandwidth.

[Chart: IPsec encrypt, Intel Sapphire Rapids 4th Generation Xeon]

When connecting to a VPN host using software-accelerated QAT via the oneAPI library, we're using just 6 cores, which is already pretty slim. However, we can free up even more resources with QAT hardware acceleration, which drops the active core count to just four of our 120 processor cores.
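To put those core counts in perspective, the arithmetic below works out how much encrypted traffic each active core is effectively handling and how many cores stay free. It uses only figures quoted in this article (120 cores, the 177 Gbps ceiling, and 6 versus 4 active cores); the helper names are our own, and it assumes the link stays saturated in both configurations.

```python
TOTAL_CORES = 120        # two 60-core processors in the system under test
LINK_LIMIT_GBPS = 177    # bandwidth ceiling observed in the IPsec test


def per_core_gbps(cores_used: int) -> float:
    """Encrypted traffic handled per active core at the link limit."""
    return LINK_LIMIT_GBPS / cores_used


def cores_free(cores_used: int) -> int:
    """Cores left over for other work."""
    return TOTAL_CORES - cores_used


software_qat = per_core_gbps(6)   # 29.5 Gbps per core, 114 cores free
hardware_qat = per_core_gbps(4)   # 44.25 Gbps per core, 116 cores free
improvement = hardware_qat / software_qat  # 1.5x more traffic per active core
```

Going from six cores to four sounds small in absolute terms, but it means each remaining core is driving half again as much traffic.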

We do feel pretty confident in these tests. While connected to the server and client, we were able to confirm that the hardware configuration was identical to what Intel shipped us; we really only needed to SSH into the remote machine for its high-bandwidth network connection and for the client's beefy configuration. Most importantly, we got to run these tests ourselves and make our own observations.

Key Take-Aways From Our Sapphire Rapids 4th Gen Xeon Testing

It's been a pretty exciting year in the consumer PC hardware space, but that doesn't mean data center server hardware has been sitting still. In fact, our direct testing of Intel's 4th Generation Xeon processors with QuickAssist Technology proved to us that there are still big gains to be had in the data center across a variety of very common workloads. It's valuable to verify that what we saw in person would carry over to our own environment with no special setup required. While we had to run our tests in something of a bubble due to early production Intel silicon, it's hard all the same to look at these numbers and not be impressed. Intel's 4th Gen Xeon system always hit its target claims, and did so via software that we installed ourselves from scratch. There's no magic at work here, and that should set folks' minds at ease.


Intel's QuickAssist Technology acceleration that's part of Sapphire Rapids isn't exactly new, as Xeon Scalable processors have had additional hardware to push specific workloads for a few generations now. There's a reason these processors are called 4th generation Xeons, however. It's interesting to see Intel target very specific server workloads in a way that can dramatically improve CPU resource utilization and efficiency, freeing up cores and making them available to take on other tasks. Theoretically, this can reduce the number of machines required to handle a workload or allow the server to scale in the number of tenants it can handle across workloads. We had 120 cores at our disposal, but some tests used as few as four of those cores to drive the QAT hardware acceleration while the rest were available for other requirements.

And of course the competition isn't sitting still. AMD just announced that it will live stream an event on November 10 to talk about its next generation of EPYC datacenter CPUs, codenamed Genoa. Presumably these processors will be based on the red team's latest Zen 4 architecture, potentially with X3D caches on board, if recent rumors turn out to be true. That means that once again, the server space is heating up. We don't know if Sapphire Rapids will officially launch by that date, but we do expect this fall to be a busy one in the datacenter, and Intel's 4th Gen Xeons are looking really good at this early stage.

Stay tuned to HotHardware for the latest in big iron heavy lifting as these great new server platforms officially roll out.
