Linux Driver Patch Reveals L4 Cache For Intel 14th-Gen Meteor Lake Processors

Graphics processors need a lot of memory bandwidth for good performance. This is why discrete GPUs have hot-clocked GDDR or HBM on massively wide memory interfaces, but integrated graphics don't have that luxury. Instead, they have to share a relatively pokey (in terms of bandwidth) main memory interface with the CPU cores.
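
To put rough numbers on that gap, here is a minimal sketch of the usual peak-bandwidth arithmetic: bus width in bytes multiplied by the per-pin transfer rate. The function name and both memory configurations are illustrative assumptions chosen for the example, not figures from Intel or from the mailing-list post.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_gtps: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    bus_width_bits: total width of the memory interface in bits.
    transfer_rate_gtps: transfers per second per pin, in GT/s.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_gtps


# Assumed example configurations (hypothetical, for illustration only):
# a discrete card with 256-bit GDDR6 at 16 GT/s per pin...
discrete_gbs = peak_bandwidth_gbs(256, 16.0)
# ...versus dual-channel (128-bit total) DDR5-5600 shared with the CPU cores.
shared_gbs = peak_bandwidth_gbs(128, 5.6)

print(f"discrete: {discrete_gbs:.1f} GB/s, shared: {shared_gbs:.1f} GB/s")
print(f"ratio: {discrete_gbs / shared_gbs:.1f}x")
```

These are theoretical peaks; sustained bandwidth is lower in practice, and the integrated GPU only gets whatever share of the main memory interface the CPU cores leave free.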

Traditionally, the integrated graphics on Intel's CPUs have been able to make use of the CPU's last-level cache (LLC) to help with the most latency- and bandwidth-sensitive graphics operations. That's not going to be possible—or at least practical—with Intel's next-generation processors because of their tiled nature. The CPU L3 cache will reside on the compute tile, and to access it, the GPU would have to cross the base tile.

Not having any sort of cache at all would be miserable for the GPU's performance. Even discrete GPUs have considerable caches for rapid data mangling. While the Arc Alchemist-based GPU tile for Meteor Lake surely has some L1 cache on-die, it'll need a larger cache for good performance. Fortunately, thanks to a leak from Intel's Fei Yang, we know that the Blue team's 14th-gen CPUs will include an L4 cache for this reason.

Replying to a post on the intel-gfx mailing list, Yang remarked that the "GT" (Intel's term for its integrated graphics) "can no longer allocate on LLC - only the CPU can." He goes on to say that this change, "along with addition of support for ADM/L4 cache," requires updates to the Linux graphics driver.

We have no idea what "ADM" stands for; some reasonable guesses are "All Device Memory" or "Arc Discrete Module". "L4 cache" is clear enough, though. This leak comes to us by way of Japanese-language hardware blog Coelacanth's Dream, where the Coelacanth ponders the function and location of the L4 cache. He mentions the last time Intel's processors had an L4 cache, which was with the Crystal Well eDRAM package on certain older CPUs.

Ponte Vecchio has a large 144MB cache on its base tile.

However, while Crystal Well was an L4 cache, it was a "memory-side" or "system-level" cache, meaning that it was used to cache any type of memory access by any part of the SoC. The blog's author reckons that the L4 cache on Meteor Lake is more of a traditional last-level cache for the CPU and GPU cores, and that it will be found in the processor's "Base Tile" upon which the functional compute elements rest. That's based on the fact that the same configuration is found in Intel's Ponte Vecchio datacenter GPUs.

It will be interesting to see whether these conjectures are accurate, as well as whether Intel will use the size of the L4 cache as a means of segmenting Meteor Lake processors. It will also be fascinating to see what effect it has on both CPU and GPU performance. You can bet that we'll be looking into this once we have 14th-gen silicon in hand.