Intel Aurora Supercomputer Ready To Break Records With Nearly 64K GPUs Across 10K Blades
Now, witness the power of this fully armed and operational supercomputer. That's right: it's finally finished, and its 10,624 blades pack in a total of 21,248 Xeon Max CPUs and 63,744 Intel Data Center Max series GPUs. Intel and Argonne aren't talking in terms of core counts—likely because such numbers are largely meaningless at this scale—but if Aurora uses the top-end Xeon Max CPU SKU, this machine would have 1,189,888 CPU cores.
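For anyone who wants to check the math, those totals can be reproduced in a few lines of Python. This is only a back-of-the-envelope sketch: the 56-core figure is our assumption for the top-end Xeon Max SKU (it is the value the core count above implies), and the variable names are purely illustrative.

```python
# Figures quoted in the article.
BLADES = 10_624
CPUS_TOTAL = 21_248
GPUS_TOTAL = 63_744

# Assumption: every CPU is a 56-core top-end Xeon Max part.
CORES_PER_CPU = 56

# Per-blade counts follow from dividing the totals by the blade count.
cpus_per_blade = CPUS_TOTAL // BLADES
gpus_per_blade = GPUS_TOTAL // BLADES

# Hypothetical total core count under the 56-core assumption.
total_cores = CPUS_TOTAL * CORES_PER_CPU

print(f"CPUs per blade: {cpus_per_blade}")
print(f"GPUs per blade: {gpus_per_blade}")
print(f"Total CPU cores: {total_cores:,}")
```

In other words, each blade works out to two CPUs and six GPUs, and the core count only holds if every socket really carries the 56-core part.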
The supercomputer includes 10.9 PB (that's petabytes, or 10,900 TB) of DDR5 memory, an additional 1.36 PB of HBM on CPUs, and 8.16 PB of HBM on GPUs. All of that RAM is fed by 220 PB of purely solid-state "Distributed Asynchronous Object Storage", which is actually down from the 230 PB number that Intel itself gave us last month in the slide above. DAOS leverages HPE's Slingshot high-performance fabric to connect the storage array to the rest of the machine at some 31 TB/second.
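To put those memory numbers in per-chip terms, here is a rough Python sketch. It assumes decimal units (1 PB = 10^15 bytes), an even split of memory across all CPUs and GPUs, and that the quoted 31 TB/second could be sustained indefinitely. None of those assumptions come from Intel or Argonne.

```python
# Decimal unit sizes (an assumption; the article does not say PB vs. PiB).
GB = 10**9
TB = 10**12
PB = 10**15

# Figures quoted in the article.
ddr5_bytes = 10.9 * PB
cpu_hbm_bytes = 1.36 * PB
gpu_hbm_bytes = 8.16 * PB
storage_bytes = 220 * PB
storage_bandwidth = 31 * TB  # bytes per second, treated as sustained
cpus = 21_248
gpus = 63_744

# Memory per device, assuming an even split.
ddr5_per_cpu_gb = ddr5_bytes / cpus / GB
cpu_hbm_per_cpu_gb = cpu_hbm_bytes / cpus / GB
gpu_hbm_per_gpu_gb = gpu_hbm_bytes / gpus / GB

# Total memory, and how long the storage link would take to move it.
total_memory_pb = (ddr5_bytes + cpu_hbm_bytes + gpu_hbm_bytes) / PB
seconds_to_fill_ram = (total_memory_pb * PB) / storage_bandwidth
hours_to_read_storage = storage_bytes / storage_bandwidth / 3600

print(f"DDR5 per CPU: {ddr5_per_cpu_gb:.0f} GB")
print(f"HBM per CPU:  {cpu_hbm_per_cpu_gb:.0f} GB")
print(f"HBM per GPU:  {gpu_hbm_per_gpu_gb:.0f} GB")
print(f"Total memory: {total_memory_pb:.2f} PB")
print(f"Time to fill all memory from storage: {seconds_to_fill_ram / 60:.1f} min")
print(f"Time to read all storage once: {hours_to_read_storage:.1f} h")
```

Under those assumptions, that is roughly 513 GB of DDR5 and 64 GB of HBM per CPU, 128 GB of HBM per GPU, and about 11 minutes to fill every byte of memory from storage.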
Intel crows that Aurora should be the fastest supercomputer in the world, at least by the measure of the TOP500 list. In fact, it expects that Aurora will be the first 2-exaflop system. That number is so large it's hard to even imagine, but it's two quintillion floating-point operations per second, or 2,000,000,000,000,000,000. It's two thousand petaflops, or two million TFLOPS in slightly more familiar units. This comes after AMD stole the 1-exaflop landmark with Frontier last year.
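If the prefixes blur together, the conversion is easy to sanity-check. The snippet below is a minimal sketch using standard SI prefixes; the 2-exaflop figure is Intel's expectation as reported here, not a measured result.

```python
# SI prefixes expressed as floating-point operations per second.
TERA = 10**12
PETA = 10**15
EXA = 10**18

# Intel's expected figure for Aurora, per the article.
aurora_flops = 2 * EXA

# Integer division is exact here because every prefix is a power of ten.
aurora_petaflops = aurora_flops // PETA
aurora_teraflops = aurora_flops // TERA

print(f"{aurora_flops:,} FLOPS")
print(f"{aurora_petaflops:,} petaflops")
print(f"{aurora_teraflops:,} teraflops")
```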
What exactly will all this power be used for? Well, you know, stuff. Jokes aside, Intel says that Aurora will be used for the usual supercomputer HPC stuff: fluid simulation, neutron and photon transport simulations, electronic structure calculations, fusion plasma simulations (as shown in the slide above), and of course, developing generative AI models.
Using Aurora, Intel, HPE, and Argonne are apparently working on a state-of-the-art generative AI model with 1 trillion parameters, trained using "general text, scientific texts, scientific data, and code". For perspective on that number, the popular Stable Diffusion only has 890 million parameters. Of course, the number of parameters isn't everything; Stable Diffusion produces markedly better results than models like DALL-E 2, which came before it and had over 3.5 billion parameters. Intel hopes the new model will be applicable to all types of research.
It’s stunning to see what the #Aurora Supercomputer is made of, including 63,744 Intel Data Center GPU Max Series and 21,248 #IntelXeon CPU Max Series processors. Today, Aurora is another step closer to powering the discoveries of tomorrow. Learn more. https://t.co/lJaENlMlT9 pic.twitter.com/D7ALzFiSuj
— Intel Graphics (@IntelGraphics) June 22, 2023