GeForce RTX 3090: NVIDIA's BFGPU Has Arrived And It Slays
Whatever you want to call the GeForce RTX 3090, one thing is for certain. As of this moment, the GeForce RTX 3090 is the single most powerful graphics card money can (almost) buy. It sits at the pinnacle of NVIDIA’s product stack currently, and according to the company, it enables things like smooth 8K gaming and seamless processing of massive content creation workloads, thanks in part to its 24GB of on-board GDDR6X memory.
A graphics card like the GeForce RTX 3090 isn’t for everyone, however. Though its asking price is about $1,000 lower than its previous-gen, Turing-based Titan RTX counterpart, it is still out of reach for most users. And the GeForce RTX 3090’s performance characteristics will likely make its value proposition interesting to only a select group of enthusiasts and creators. We’ll do our best to explain all of that on the pages ahead. For now, let’s take a look at the specs and inspect this big, beautiful beast...
The GeForce RTX 3090 is essentially the replacement for the previous-gen Titan RTX. As such, it isn’t purely a gaming-focused GPU. According to NVIDIA, demand for the various Titans was higher than anticipated, so with this generation, in addition to selling cards directly, NVIDIA is working with board partners to widen availability, and those partners will be offering GeForce RTX 3090 series cards as well.
Before we dive any deeper into the speeds and feeds though, we need to direct your attention to a few previous articles. We have already covered much of the underlying technology at the heart of the GeForce RTX 3090, so we won’t be doing so again here. If you want some of the backstory, however, we recommend checking out our coverage of NVIDIA’s initial GeForce RTX 30 series announcement, the deeper dive on its new features and
NVIDIA GeForce RTX 3090 Speeds And Feeds
As you can see in the detailed spec breakdown and comparison above, the new GA102-powered GeForce RTX 3090 is amped-up and more capable than the previous-gen Titan RTX in almost every way, except for two. The GeForce RTX 3090 has a lower default boost clock and fewer Tensor cores. The GA102’s newer architecture and additional resources more than compensate for the lower default boost frequency though, and Ampere’s 3rd-generation Tensor cores more than double the throughput of the previous generation, in addition to supporting additional types of math, like BFloat16 (BF16) and TensorFloat-32 (TF32). With regard to pixel and texture fillrate, memory bandwidth, and compute performance, the GeForce RTX 3090 is significantly more powerful than the Turing-based Titan RTX, or anything else for that matter.
As we’ve mentioned in our previous GeForce RTX 30 series and Ampere coverage, all of those additional transistors were used to enable new features, like PCIe Gen 4 support, and enhance Ampere’s performance for virtually all GPU-bound workloads. Pre-Turing, NVIDIA’s GPU architectures had only one data path, for example. Turing added a second, giving it one path for floating point and one for integer. With Ampere, that integer path has been beefed up with an additional FP32 unit, so floating point heavy workloads have much more horsepower at their disposal.
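To see why a second FP32-capable path helps floating point heavy work the most, here is a toy issue model in Python. It is only a sketch: the 16 lanes per path, the function names, and the idea of counting whole clocks for a batch of operations are simplifying assumptions for illustration, not a description of NVIDIA's actual scheduler.

```python
import math

LANES_PER_PATH = 16  # assumed lane count per data path, for illustration only


def turing_clocks(fp_ops: int, int_ops: int, lanes: int = LANES_PER_PATH) -> int:
    """Turing-style layout: one FP32-only path plus one INT32-only path."""
    # The two paths run side by side, so the busier one sets the clock count.
    return max(math.ceil(fp_ops / lanes), math.ceil(int_ops / lanes))


def ampere_clocks(fp_ops: int, int_ops: int, lanes: int = LANES_PER_PATH) -> int:
    """Ampere-style layout: one FP32-only path plus one FP32-or-INT32 path."""
    # Integer work can only use the shared path, which gives one lower bound.
    int_bound = math.ceil(int_ops / lanes)
    # All work combined is spread over both paths, which gives the other bound.
    total_bound = math.ceil((fp_ops + int_ops) / (2 * lanes))
    return max(int_bound, total_bound)


if __name__ == "__main__":
    # (FP32 ops, INT32 ops) mixes, from pure floating point to an even split.
    for fp, integer in [(3200, 0), (2400, 800), (1600, 1600)]:
        print(fp, integer, turing_clocks(fp, integer), ampere_clocks(fp, integer))
```

In this model, a pure FP32 batch finishes in half the clocks on the Ampere-style layout, while an even FP32/INT32 mix sees no change. That lines up with the point above: the gains are largest when the workload leans heavily on floating point.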
The NVIDIA GA102’s SM (Streaming Multiprocessor) configuration has also been completely revamped. Ampere’s new SMs double the L1 bandwidth and cache partition size and add 33% more L1 capacity, for up to 10,496KB on the GeForce RTX 3090.
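The 10,496KB figure can be reproduced with a quick back-of-the-envelope check. The Python sketch below assumes NVIDIA's published figures of 82 enabled SMs on the GeForce RTX 3090, 128KB of L1/shared memory per Ampere SM, and 96KB per Turing SM. None of those per-SM numbers appear in the text above, so treat them as assumptions.

```python
SM_COUNT_RTX_3090 = 82     # assumed: enabled SMs on the GeForce RTX 3090
L1_KB_PER_SM_AMPERE = 128  # assumed: L1/shared memory per GA102 SM, in KB
L1_KB_PER_SM_TURING = 96   # assumed: L1/shared memory per Turing SM, in KB


def total_l1_kb(sm_count: int, kb_per_sm: int) -> int:
    """Aggregate L1 capacity across all SMs, in KB."""
    return sm_count * kb_per_sm


def percent_increase(old: float, new: float) -> float:
    """Relative growth from old to new, as a percentage."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    print(total_l1_kb(SM_COUNT_RTX_3090, L1_KB_PER_SM_AMPERE), "KB total L1")
    growth = percent_increase(L1_KB_PER_SM_TURING, L1_KB_PER_SM_AMPERE)
    print(round(growth, 1), "% more L1 per SM than Turing")
```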
NVIDIA found that Turing often had good Bounding Box intersection rates, but Triangle Intersection rates were a limiting factor with some workloads, so Ampere got some attention in that regard as well. Ampere can now process Bounding Box and Triangle intersection tests in parallel to improve efficiency and performance, and thanks to the additional GPU resources available, Triangle Intersection rates are approximately twice as fast now too. A new Triangle Position Interpolation unit has also been added, which will enable more accurate motion blur effects in future RTX-enabled applications.
Bleeding-Edge Memory And Cooling Tech
Like the GeForce RTX 3080, the GeForce RTX 3090 is outfitted with Micron’s latest GDDR6X memory technology (the upcoming GeForce RTX 3070 will use standard GDDR6), which offers much higher bandwidth. GDDR6X leverages four-level PAM4 signaling, which encodes two bits per symbol instead of one and so transmits twice as much data per clock. The first wave of flagship Ampere-based GeForces will employ
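PAM4's advantage comes from packing two bits into each transmitted symbol, where conventional two-level signaling carries one. A minimal Python sketch of the idea follows. The bit-pair-to-level mapping and the function names are illustrative assumptions, not the actual GDDR6X encoding scheme.

```python
# Hypothetical mapping of bit pairs to four amplitude levels (Gray-coded here).
PAM4_LEVELS = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def two_level_encode(bits):
    """Two-level signaling: every bit becomes its own symbol."""
    return list(bits)


def pam4_encode(bits):
    """Four-level signaling: every pair of bits becomes one symbol."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


if __name__ == "__main__":
    data = [1, 0, 0, 1, 1, 1, 0, 0]
    print(len(two_level_encode(data)), "symbols with two-level signaling")
    print(len(pam4_encode(data)), "symbols with PAM4")
```

The same eight bits need eight symbols with two levels but only four with PAM4, so at an equal symbol rate the link moves twice as much data.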
The enhancements introduced with Ampere aren’t all about performance, though. NVIDIA also tweaked a few things to improve overall efficiency. For example, with previous-gen architectures, NVIDIA had one power rail for both the GPU cores and memory controller. A single-rail design meant that if one resource wanted to operate at high voltage, the other had to as well. With Ampere, however, NVIDIA split the core and memory power rails into separate feeds, so they can operate independently. Dual power rails should allow for finer-grained control and energy savings, which ultimately means improved power and thermal characteristics.
The GeForce RTX 3090’s cooler is outfitted with dual axial fans and a split heatsink design that is quieter than previous-gen solutions, while capable of dissipating up to 90 more watts of power. One end of the heatsink is attached to a vapor chamber that’s mounted directly to the GPU and memory. The fan above that section directs air through the heatsink and immediately funnels it out of the chassis through large vents in the case bracket. The heatsink on the back half of the card, which is linked to the front vapor chamber via multiple heat-pipes, allows air from the second fan to pass all the way through, where it rises to the top of the chassis and is eventually exhausted from the system, assuming it’s got decent ventilation.
on their cards.
Pop over to this URL if you want those deets.
Now let’s get to some numbers...