AMD Ryzen Threadripper Architecture Details
AMD Ryzen Threadripper Core Topology
We have already covered the AMD Zen microarchitecture and Ryzen-series processors in depth, so we are not going to cover all of the finer details again here. If you’d like to learn more about the main features that make up AMD’s Zen microarchitecture, like SenseMI, Pure Power, Precision Boost, Extended Frequency Range (XFR), and its Neural Net Prediction and Smart Prefetch, we strongly recommend reading our initial Ryzen coverage, available here. Since Threadripper borrows a lot from EPYC as well, this deep dive on Naples and the initial EPYC series processor line-up is also recommended.
With that said, there are a few things we should point out. Though there appear to be four dies under Threadripper’s heat-spreader, these processors feature only two functional dies, mounted catty-corner from one another. The other two pieces of silicon that look like dies are just bare silicon slugs that are there to mechanically balance the design.
A De-Lidded AMD Ryzen Threadripper Processor
The two dies on a Threadripper processor are connected to each other with AMD’s Infinity Fabric. Each 8-core die consists of two quad-core compute complexes (CCX) that are linked by Infinity Fabric as well. Infinity Fabric consists of two key elements: a scalable control fabric and a scalable data fabric. The scalable control fabric has all of the central control elements, with small remote elements dispersed in each block of the SoC. Data from a myriad of sensors embedded across the SoC feeds into the control elements across the fabric. The scalable data fabric is much like a high-performance network pathway. It features a common, low-latency bus and a coherent HyperTransport Plus bus that’s multi-socket and multi-die ready. Infinity Fabric is the key element that AMD uses to link 8-core dies together on a single package.
Threadripper’s quad-channel memory configuration comes by way of the two dies: each die features two memory channels. This type of memory configuration can affect performance and/or compatibility with certain application types (particularly games), which are not typically designed with distributed memory controllers in mind. AMD addresses this issue by allowing users to switch between Distributed (UMA, Uniform Memory Access) and Local (NUMA, Non-Uniform Memory Access) modes via its Ryzen Master utility.
Distributed Mode places the system into a Uniform Memory Access (UMA) configuration, which prioritizes even distribution of memory transactions across all available memory channels. Distributing memory transactions in this way improves memory bandwidth and maximizes performance for applications that place a premium on raw bandwidth. This is the default configuration for the Threadripper processors.
Local Mode places the system into a Non-Uniform Memory Access (NUMA) configuration, which allows each die to prioritize transactions within the DIMMs that are physically nearest to the core or cores processing the associated workload. Localizing memory content to the nearest cores improves latency for gaming applications that tend to place a premium on fast memory access.
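To see which of the two memory modes the operating system is actually presenting, one option on Linux is to look at the NUMA nodes exposed through sysfs. The sketch below is only an illustration: it assumes a Linux system with the standard `/sys/devices/system/node` layout, and the helper names `parse_cpulist` and `numa_nodes` are our own, not part of any AMD tool. In Local Mode you would expect to see one node per die, while Distributed Mode should typically show a single node.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-7,16-23' into a sorted list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return sorted(cpus)


def numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: [logical CPU ids]}; empty if the sysfs directory is absent."""
    root = Path(sysfs_root)
    nodes = {}
    if not root.is_dir():
        return nodes
    for node_dir in root.glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            # Directory names look like 'node0', 'node1', ...
            nodes[int(node_dir.name[4:])] = parse_cpulist(cpulist.read_text())
    return nodes


# Report what the OS exposes; on non-Linux systems this prints nothing.
for node_id, cpus in sorted(numa_nodes().items()):
    print(f"NUMA node {node_id}: {len(cpus)} logical CPUs")
```

This only reports what the OS sees; switching between the modes is still done through Ryzen Master as described above.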
AMD also allows Threadripper processors to be switched into what it calls a legacy compatibility mode. While qualifying Threadripper, AMD discovered a handful of games that would fail to launch when more than 20 logical cores were detected (Dirt Rally, Far Cry Primal, and Far Cry among them). Switching into legacy compatibility mode essentially disables one of the dies and turns the processor into an 8-core / 16-thread machine. AMD has tested over 60 PC gaming titles with legacy compatibility mode and provided the following data…
- Global average: Legacy Compatibility Mode provided a +4% uplift across 1080p/1440p/2160p.
- Global median: Legacy Compatibility Mode provided a +2% uplift across 1080p/1440p/2160p.
- Top 25% average: Legacy Compatibility Mode provided a +12% performance uplift across 1080p/1440p/2160p.
- Bottom 25% average: Legacy Compatibility Mode imparted a 5% performance decrease across 1080p/1440p/2160p.
Also note that switching into legacy compatibility mode doesn't alter the memory or PCIe lane configuration. The non-core elements of the second die remain active; it's only the CPU cores that get disabled.
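The arithmetic behind these modes is easy to tally. The following is a minimal sketch using only the figures quoted in this article (two dies, two quad-core CCXs per die, two memory channels per die, and the 20-logical-core threshold AMD mentioned). The two-threads-per-core value is inferred from the 8-core / 16-thread figure, and the `Package` class and its field names are hypothetical, purely for illustration.

```python
from dataclasses import dataclass

# Threshold from the article: some games fail above this many logical cores.
GAME_LOGICAL_CORE_LIMIT = 20


@dataclass(frozen=True)
class Package:
    dies: int = 2
    ccx_per_die: int = 2
    cores_per_ccx: int = 4
    threads_per_core: int = 2      # inferred from "8-core / 16-thread"
    channels_per_die: int = 2
    dies_with_cores_enabled: int = 2

    @property
    def cores(self):
        return self.dies_with_cores_enabled * self.ccx_per_die * self.cores_per_ccx

    @property
    def threads(self):
        return self.cores * self.threads_per_core

    @property
    def memory_channels(self):
        # Legacy mode disables cores only, so every die still contributes channels.
        return self.dies * self.channels_per_die


default_mode = Package()
legacy_mode = Package(dies_with_cores_enabled=1)

for name, pkg in (("default", default_mode), ("legacy", legacy_mode)):
    over_limit = pkg.threads > GAME_LOGICAL_CORE_LIMIT
    print(f"{name}: {pkg.cores} cores, {pkg.threads} threads, "
          f"{pkg.memory_channels} memory channels, over limit: {over_limit}")
```

With both dies active the package exposes 32 logical cores, well above the 20-core threshold. With one die's cores disabled it drops to 16, which is why the affected games launch, while the memory channel count stays at four.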