12 May 2020

Dawn of AMD

Written by Martin Knakal

For most of the last decade and a half, Intel ran virtually unopposed in the field of server-side CPUs. This technological supremacy lasted until June 2017, when AMD released its first EPYC line of server processors. For the first time in a decade and a half, AMD became competitive again, and it has been steadily gaining market share while Intel has mostly stagnated.

In the three years since, we’ve seen AMD move from the 14nm fabrication process of the original Zen microarchitecture, through 12nm for Zen+, to Zen 2 manufactured on a 7nm process. Meanwhile, Intel’s 10nm is still mostly non-existent.

Last year brought Intel’s Cascade Lake refresh and AMD’s Rome architecture. Enough time has passed that we now have the data to say with confidence which one really came out on top.

Hard stats comparison

Toward the end of the year Intel published its own data, showing the Intel Xeon Platinum 9282 beating AMD’s EPYC 7742 in a variety of HPC benchmarks by a geometric mean of 34%. Whether that was a valid comparison we leave up to the reader. Luckily there were many more comparisons, done by independent parties.
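For readers unfamiliar with how a headline figure like "a geometric mean of 34%" is put together, here is a minimal sketch. The speedup values are made up purely for illustration and are not Intel's published numbers; the function name `geometric_mean` is our own. The sketch shows why the selection of benchmarks matters: the summary depends entirely on which ratios go in.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive speedup ratios (score A / score B)."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    # Average the logarithms, then exponentiate.
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-benchmark speedups, NOT real measurements.
speedups = [2.0, 0.5, 1.25, 0.8]

geo = geometric_mean(speedups)
arith = sum(speedups) / len(speedups)

# The wins and losses cancel out in the geometric mean,
# while the arithmetic mean still reports a lead.
print(f"geometric mean:  {geo:.4f}")
print(f"arithmetic mean: {arith:.4f}")
```

A geometric mean of 1.34 would correspond to the "34%" in a claim like Intel's. Dropping or adding a single benchmark can move that number considerably.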

The most popular comparison since their launch in Q3 2019 has been the AMD EPYC 7742 against a dual-socket (2S) Intel Xeon Platinum 8280. Both are general-purpose high-end processors, so pitting them against each other makes sense.

At first glance you would expect Intel to have better single-core performance, and most benchmarks confirm that expectation. Unfortunately for Intel, that is where the good news ends, since the EPYC 7742 dominates pretty much everywhere else. Interestingly enough, the same results show up in comparisons across the product range, meaning that even lower-tier CPUs retain the same relative differences.

Compression and encryption performance are among the most interesting metrics from the CDN point of view. Let’s have a quick look at some benchmarks done by ServeTheHome and Phoronix.

Compression Performance

source: https://www.servethehome.com/wp-content/uploads/2019/10/AMD-EPYC-7742-7zip-Compression-benchmark.jpg

source: https://cdn.mos.cms.futurecdn.net/sppNvSHPS4j8MALRcmrRWm-650-80.png

OpenSSL Performance

source: https://www.servethehome.com/wp-content/uploads/2019/10/AMD-EPYC-7742-OpenSSL-Sign-benchmark.jpg

source: https://www.servethehome.com/wp-content/uploads/2019/10/AMD-EPYC-7742-OpenSSL-Verify-benchmark.jpg

source: https://cdn.mos.cms.futurecdn.net/reLy93NSCdwWEXgr7aRfaT-650-80.png

Almost all of the above show that the 7002 series is at least on par with Intel and generally performs much better. The dual Intel Xeon Platinum 8280 setup, however, has almost double the TDP of the EPYC 7742. A quick Google search will reveal that TDP figures are not a reliable measure for comparison, but such a huge difference stands out on its own and widens the price gap between the two.

So what makes AMD superior at the moment?

There is very little doubt that AMD has the technological edge for now. However, mere core count isn’t the only thing making it superior. The Zen 2 architecture introduced some serious upgrades.

Cache design

High-end 7002 processors come with a huge 256 MiB L3 cache, which is unprecedented in widely available CPUs. But the real magic happens when you take a closer look at the memory design. While the previous generation had four NUMA nodes per socket, the 7002 series has a single one.

source: https://www.servethehome.com/wp-content/uploads/2019/08/AMD-EPYC-7002-Architecture-NUMA-Reduction-to-104ns-Close-201ns-Far.jpg

Experiments done by CF show how the dramatically improved L3 cache miss ratio translated directly into 30% better performance (adjusted for operating frequency).

Security

RAM encryption has been with us for a while. In 2014 Intel introduced Software Guard Extensions (SGX), which allow applications to create encrypted “enclaves”. While better than no encryption at all, SGX came with severe limitations. You had to design applications around it, and as a result they could only run on Intel processors. All of the encrypted memory also had to fit into a dedicated physical memory range called the enclave page cache (EPC), which is hard-capped at 128 MiB, so there was no chance of full-memory encryption. Together these limitations hindered wide deployment, not to mention a performance drop of up to 50%.

Since then AMD has introduced Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV). SME lets you mark individual pages to be encrypted with a 128-bit AES key that the hardware generates randomly on every boot and that software cannot access. Encrypted pages are marked by setting the C-bit in the page table entry, but you can also choose to encrypt the whole memory.

The most important aspect is that all of it is very easy to turn on: you enable one BIOS option and one kernel flag. In other words, you are pretty much given an on/off switch for memory encryption.
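As a rough way to see whether a Linux machine advertises these features, the sketch below looks for the `sme` and `sev` CPU flags in `/proc/cpuinfo`. The helper name `memory_encryption_flags` is our own. We also assume the kernel flag in question is the Linux boot parameter `mem_encrypt=on`, and whether a flag shows up can depend on the kernel version and BIOS settings, so treat the result as a hint rather than proof that encryption is active.

```python
def memory_encryption_flags(cpuinfo_text):
    """Return which of the 'sme' and 'sev' CPU flags appear in cpuinfo text."""
    wanted = {"sme", "sev"}
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only inspect the "flags" lines, one per logical CPU.
        if sep and key.strip() == "flags":
            found |= wanted & set(value.split())
    return found


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = memory_encryption_flags(f.read())
    except FileNotFoundError:
        # Not a Linux system, or /proc is not mounted.
        flags = set()
    print("SME advertised:", "sme" in flags)
    print("SEV advertised:", "sev" in flags)
```

Flags are compared as whole tokens, so a related flag such as `sev_es` is not mistaken for `sev`.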

SEV applies the same mechanism per virtual machine: each VM gets its own encryption key and controls its own encryption. Various benchmarks by AMD and others show that the performance drop caused by SME is more or less non-existent.

source: https://sviko.com/uploads/default/original/1X/60651f74e6328d8095245fe4315faf1d17ad6b0a.jpeg

Does it make your data perfectly safe? Unfortunately, still no. It has been shown multiple times that memory encryption won’t save you from various side-channel attacks. The Fraunhofer Institute for Applied and Integrated Security published a paper on its SEVered attack, aiming to point out vulnerabilities in AMD’s solution. Still, encryption goes a long way toward mitigating other attacks.

Intel’s answer, Total Memory Encryption (TME) and Multi-Key Total Memory Encryption (MKTME), is still largely hypothetical and not available on any Intel processor at the moment. At RSA Conference 2020 in San Francisco, Intel revealed plans to introduce full-memory encryption in upcoming CPUs. Whether it will merely catch up or bring real innovation remains to be seen.

What the future holds

From all of the above it may seem that Intel is done for. Intel’s own CFO George Davis said that 10nm will be weak from a financial point of view, which is unsurprising since the process is already becoming obsolete and has yet to launch. However, Intel is still in the game.

This year Ice Lake and Cooper Lake Xeon processors are expected to hit the market. Cooper Lake in particular brings bfloat16 support. If you’ve never heard of it, bfloat16 is a 16-bit floating-point format that has the same range as float32 but sacrifices some of its precision. Google’s experiments show that it provides a significant performance boost for ML and AI applications without a noticeable impact on model accuracy. While Cooper Lake will only be available to “key partners”, wide bfloat16 support can be expected in the future. It might make Intel more attractive for the AI and HPC segments. AMD has not confirmed official support, but it is to be expected.
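To make the range-versus-precision trade-off concrete, here is a minimal sketch that emulates bfloat16 in plain Python by keeping only the top 16 bits of a float32 (sign, 8 exponent bits, 7 mantissa bits). The helper names are our own, and the sketch simply truncates the low bits; real hardware and libraries typically round to nearest instead, so exact results may differ slightly.

```python
import struct


def float32_bits(x):
    """Return the 32-bit IEEE 754 pattern of x as an unsigned integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def to_bfloat16_bits(x):
    """Keep the top 16 bits of the float32 pattern (truncation, an assumption)."""
    return float32_bits(x) >> 16


def from_bfloat16_bits(bits):
    """Expand a 16-bit bfloat16 pattern back to a float by zero-filling."""
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]


def roundtrip(x):
    """Value of x after being stored as (truncated) bfloat16."""
    return from_bfloat16_bits(to_bfloat16_bits(x))


# Range: a value near the top of the float32 range is still representable.
print(roundtrip(3.0e38))

# Precision: only 7 mantissa bits survive, so small differences vanish.
print(roundtrip(1.001))
```

Because the exponent field is the same width as in float32, very large and very small magnitudes survive the conversion, while values that differ only in the lower mantissa bits collapse to the same number.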

A recently leaked benchmark suggests that a 12-core Ice Lake CPU performs almost as well as the 24-core Xeon Gold 6226, which would translate to a 100% performance improvement if you choose to believe it. Another impressive benchmark appeared in the SiSoftware ranker, and this time we got a bit more information. The 14-core, 28-thread CPU features 21 MB of L3 cache and 17.5 MB of L2, with a base clock speed of 2 GHz. The closest comparable match would be the three-year-old Intel Xeon Gold 6132, and ranking-wise there is a 54% improvement. At the moment it’s very difficult to say whether Ice Lake processors will live up to the expectations, but both leaks suggest that AMD’s Zen 3-based Milan processors may have solid competition this year.