CPUs are winning the supercomputing race because backward compatibility now beats raw performance
GPUs get the glory in the AI boom, but CPUs still run 80% to 90% of high‑performance computing workloads. That isn’t a glitch in the hype cycle; it’s a structural fact about where power now lives in computing. In supercomputing, the most valuable resource is no longer floating‑point operations per second. It’s the accumulated mass of human code that “just works” on CPUs, and the financial pain of moving that code anywhere else.
The new MIT Technology Review Insights report on high‑performance CPUs, together with current vendor data, quietly makes this clear. CPUs are evolving (chiplets, on‑package memory, hybrid CPU‑GPU designs), but the real story isn't speed. It's that CPUs have become the default substrate for continuity: of software stacks, of business processes, of institutional memory. As long as rewriting code is harder than taping more silicon onto the package, backward compatibility will beat raw performance, and CPUs will remain the hidden governors of the AI era.
The evidence: the CPU as the “it‑just‑works” center of HPC
The report starts with a blunt empirical claim: CPUs still handle 80% to 90% of high‑performance computing workloads worldwide. These aren’t toy tasks; they include climate modeling, semiconductor design, and large‑scale scientific simulations. In other words, even in fields that aggressively chase performance, the vast majority of cycles still flow through CPUs.
This isn’t happening because GPUs are weak. The discourse around AI makes GPUs sound like the de facto replacement for general‑purpose CPUs. Yet the report frames GPUs as “workhorses” of the AI revolution, while immediately insisting that “the central processing unit (CPU) remains the backbone” of HPC. The backbone metaphor is revealing: GPUs and accelerators may be the muscle, but CPUs are the structure everything else is attached to.
Why? The report quotes Evan Burness of Microsoft Azure describing CPUs as the “it‑just‑works” technology. In practice, that’s about software, not silicon: organizations run complex, proprietary codebases that have been layered, patched, and tuned for CPU architectures over years or decades. Porting those systems to GPU‑first environments isn’t just a compilation exercise; it is costly, risky surgery on business‑critical logic.
The report explicitly contrasts this with GPU migration, calling moving complex, proprietary code to GPUs “an expensive and time‑consuming effort,” while CPUs typically maintain software continuity “across generations with minimal friction.” In other words, CPU vendors don’t need to beat GPUs on every benchmark; they only need to keep evolving in ways that avoid breaking the mountain of existing software that already runs on them.
Meanwhile, the competitive landscape for CPUs has diversified. What was once an Intel‑dominated x86 monoculture now includes:
- ARM‑based CPUs, like those powering the Fugaku supercomputer in Japan, showing that radically different instruction sets can still be pulled into the CPU continuity orbit.
- Emerging architectures such as RISC‑V, open and customizable, aiming to participate in the same general‑purpose role.
- Cloud provider silicon from Microsoft and AWS, with CPUs designed in‑house specifically to sit at the center of cloud HPC and AI offerings.
The research context reinforces this structural role. AMD's EPYC 9005 (Zen 5) line reaches up to 192 cores per CPU while maintaining strong single‑threaded performance, an explicit nod to heterogeneous workloads that mix tightly parallel simulation with stubborn serial code paths that were never rewritten. Intel's Xeon 6 family splits into performance cores (Granite Rapids) and energy‑efficient cores (Sierra Forest), signaling a similar reality: supercomputing is no longer a purist's world of perfectly scalable kernels; it's an unruly mix of old and new code running side by side.
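The arithmetic behind that design choice is worth making explicit. A back‑of‑envelope Amdahl's‑law sketch (standard formula; the parallel fractions below are illustrative, not drawn from the report) shows why a stubborn serial remainder keeps single‑threaded performance relevant even at 192 cores:

```python
# Back-of-envelope Amdahl's law: ideal speedup on n cores when a fraction p
# of the work parallelizes perfectly and the remaining (1 - p) stays serial.

def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.90, 0.99, 0.999):
    print(f"{p:.1%} parallel -> {amdahl_speedup(p, 192):6.1f}x on 192 cores")

# 90.0% parallel ->    9.6x on 192 cores
# 99.0% parallel ->   66.0x on 192 cores
# 99.9% parallel ->  161.2x on 192 cores
```

Even a 99% parallel code path tops out around 66x on 192 cores; the serial remainder is exactly where single‑threaded performance, and the old code, still governs.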
Both vendors emphasize not just headline performance, but platform support: PCIe 5.0, DDR5, high‑bandwidth fabrics like Infinity Fabric and CXL. The CPUs are framed as hubs around which accelerators, memory, and storage orbit. Even when NVIDIA Grace, Tenstorrent, or Arm Neoverse chips enter the picture, they are described as “complementing” CPUs for AI‑heavy workloads, not replacing them as the primary execution substrate.

Crucially, the economic framing is about total cost of ownership (TCO) and change management. Next‑generation CPUs increase power density and demand new cooling, but those are treated as solvable infrastructure problems. The deeper cost flagged in the research is human: retraining staff, re‑architecting applications, managing software licensing models tied to CPU cores. The CPUs keep getting denser and more capable precisely to forestall the need for organizations to question these higher‑order dependencies.
Put differently: the ecosystem keeps bending silicon so that human code doesn’t have to move.
The mechanism: when code is capital, compatibility becomes a weapon
The structural pattern underlying all of this is simple: human code is now a form of capital, and that capital is overwhelmingly denominated in CPU terms. Once that’s true, the competition isn’t “GPU vs CPU” in a vacuum. It’s “how expensive is it to re‑denominate this capital into a new architecture?”
High‑performance workloads are rarely clean, modular, or easily portable. They entangle:
- Decades of domain‑specific algorithms (climate models, CFD solvers, financial risk engines)
- Opaque vendor libraries with CPU‑centric optimizations baked in
- Operational scripts, data pipelines, and monitoring tools that assume CPU behavior
- Core‑based licensing schemes that price software by CPU sockets and threads
This stack accretes around the CPU architecture of the day. Every patch, optimization, and bureaucratic process that touches it makes the stack heavier. Over time, the cost of moving that mass to a new substrate—GPUs, custom accelerators, even a different CPU ISA—rises nonlinearly.
Vendor incentives align perfectly with this inertia. AMD and Intel are not just selling chips; they are selling the promise that the mountain of CPU‑targeted code will continue to run with “minimal friction” on their next generation. The move to chiplet architectures, on‑package memory, and hybrid CPU‑GPU designs is a way of increasing performance without forcing major software rewrites. Instead of asking enterprises to meet new hardware halfway, CPUs contort themselves to keep the old abstractions viable.
Cloud providers amplify this dynamic. When Microsoft and AWS design their own CPUs, they aren’t simply optimizing flops per watt. They are hard‑coding their cloud platforms’ assumptions about instance types, pricing, and APIs into silicon. If a CPU is the “it‑just‑works” baseline for HPC, then a cloud‑native CPU is the “it‑just‑works here” anchor for a particular cloud’s ecosystem. Each provider’s CPU becomes a gravitational well for workloads, making exit and re‑platforming even more expensive.
On the other side of the table, organizations measure infrastructure choices in TCO terms: acquisition cost, power, cooling, software licenses, and change management. The research context underscores that next‑gen CPUs require expensive cooling upgrades and more sophisticated power distribution. But it also notes that these upfront costs can be offset by higher density, energy efficiency, and reduced server counts, compressing ROI timeframes to 12-18 months in HPC‑intensive fields like drug discovery or financial modeling.
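To see how those ROI claims pencil out, here is a deliberately simplified payback model; the server counts, prices, and operating costs are entirely hypothetical, chosen only to land inside the 12‑18 month window the report cites:

```python
# Toy TCO model (all numbers hypothetical): consolidating an HPC cluster
# onto denser CPUs trades one-time upgrade costs against recurring savings.

old_servers = 400            # existing nodes, 48 cores each
new_servers = 100            # replacement nodes, 192 cores each (same total cores)

server_cost = 25_000         # purchase price per new node
cooling_upgrade = 500_000    # one-time facility work for higher power density

power_per_server = 6_000     # annual power + cooling per node
maintenance_per_server = 2_000  # annual support contract per node

upfront = new_servers * server_cost + cooling_upgrade
annual_savings = (old_servers - new_servers) * (power_per_server + maintenance_per_server)

payback_months = upfront / annual_savings * 12
print(f"upfront: ${upfront:,}, annual savings: ${annual_savings:,}")
print(f"payback: {payback_months:.0f} months")  # 15 months with these inputs
```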

That equation deliberately excludes the much larger, harder‑to‑quantify cost of organizational code migration. Rewriting, validating, and operationalizing critical HPC workloads on GPUs or non‑CPU accelerators threatens timelines, regulatory approvals, and institutional reputations. In sectors where a failed simulation can cascade into a failed product launch or a missed regulatory window, those non‑silicon risks dominate decision‑making.
The consequence is a form of soft lock‑in where CPUs maintain dominance not because they are always the best tools for a given numeric kernel, but because they sit at the intersection of everything that cannot fail: compliance, uptime, historical comparability of results, vendor contracts, and staff expertise. In this environment, backward compatibility is not a nice‑to‑have feature—it is a weapon that defends incumbents against more performant outsiders.
Even alternative CPU architectures like ARM and RISC‑V end up reinforcing this logic. Fugaku’s ARM‑based design shows that you can radically change ISA and still present something that behaves like a CPU from the software stack’s perspective. The ambition of RISC‑V and startups like Tenstorrent is not to displace the CPU category but to join it, offering CPUs that can tap into the same “general‑purpose, widely supported” narrative. In doing so, they validate the CPU as the primary unit of continuity—even as they challenge who profits from it.
The implications: HPC becomes an optimization of the past, not a reinvention of the future
If backward compatibility remains the dominant organizing principle, several patterns follow.
First, the frontier of “next‑generation” supercomputing shifts from architectural daring to architectural accommodation. The most celebrated advances in CPUs (192‑core EPYC parts, Intel's mix of performance and efficiency cores, elaborate chiplet topologies) are fundamentally exercises in making the old model scale. They stretch the single‑node CPU abstraction as far as possible so that software written in a CPU‑centric era doesn't have to confront its own limits.
This doesn’t mean innovation stops; it means the direction of innovation is biased. Design effort goes into how many cores, how to feed them with on‑package memory, how to attach accelerators via fabric standards like CXL—all to protect the illusion of a big, fast, coherent CPU with familiar semantics. Alternative paradigms that might demand different programming models or mental models—dataflow architectures, radically asynchronous systems, domain‑specific accelerators—remain secondary, precisely because they require humans to think differently.
Second, cloud providers gain disproportionate control over how “general‑purpose” computing evolves. When a hyperscaler ships its own CPUs, it can tune the boundary between what runs on CPUs, what runs on GPUs, and what gets pushed to specialized NPUs or offload engines. The division of labor between these components, once set, starts to define best practices, tooling, and third‑party software integration.
In this world, the CPU is not just a chip; it’s a policy surface. Decisions about instruction sets, cache hierarchies, and accelerators become decisions about which workloads are easy and which are expensive. Organizations building on top of these CPUs inherit those priorities without ever seeing them explicitly framed as political choices.

Third, human technical work inside enterprises tilts further toward maintenance and tuning rather than invention. The research context talks about “IT staff training,” “application optimization,” and managing thermal and networking upgrades. The skills being cultivated are adapting infrastructure to new CPU generations, squeezing more out of current code, recompiling and retuning. There is far less structural incentive to reimagine workloads from the ground up for a GPU‑first or accelerator‑first world.
The result is a kind of local maximum: organizations keep making their CPU‑anchored systems better, cheaper, and more efficient, while whole categories of potentially transformative architectures remain underexplored because they require stepping off that hill. The hardware is capable of more radical rearrangements, but the software—and the people who own it—are locked into marginal improvement loops.
Finally, “performance” itself gets redefined to match what CPUs are good at. If 80% to 90% of HPC workloads run on CPUs, benchmarks, funding, and success metrics drift toward what CPUs excel at: mixed workloads with significant serial components, large memory footprints, and complex control logic. GPU‑friendly formulations that might demand algorithmic change are penalized not just technically but organizationally: they look risky, expensive, and misaligned with the TCO models everyone uses.
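A stylized sketch makes that asymmetry concrete. Both functions below compute the same square roots via Newton's method (the function names and inputs are illustrative, not from the report): the first is shaped like legacy CPU code, with per‑element loops and data‑dependent exits; the second is the data‑parallel reformulation an accelerator wants, and getting from one to the other is an algorithmic rewrite, not a recompile:

```python
import numpy as np

def serial_newton(values, tol=1e-12):
    """CPU-shaped: per-element loop, data-dependent iteration counts,
    early exit. Easy to interleave with decades-old control logic."""
    out = []
    for v in values:               # one element at a time
        x = max(v, 1.0)            # crude starting guess
        while abs(x * x - v) > tol * (v + 1.0):
            x = 0.5 * (x + v / x)  # iterate only as long as this element needs
        out.append(x)
    return out

def batched_newton(values, iters=30):
    """Accelerator-shaped: fixed iteration count, no per-element branching,
    every lane does identical work. Getting here is a rewrite."""
    v = np.asarray(values, dtype=np.float64)
    x = np.maximum(v, 1.0)
    for _ in range(iters):
        x = 0.5 * (x + v / x)      # same update applied to the whole array
    return x

vals = [0.25, 2.0, 1_000_000.0]
print(serial_newton(vals))   # each element stops on its own schedule
print(batched_newton(vals))  # every element pays for all 30 iterations
```

The batched version wastes work on elements that converged early; that waste is the price of uniformity, and deciding whether it's acceptable is exactly the algorithmic question legacy codebases were never forced to answer.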
Over time, this can subtly constrain the kinds of scientific and AI questions that get asked at scale. Problems that fit the CPU‑centric mold are easier to justify and operationalize; those that would benefit from radically different architectures have to fight upstream against the compatibility machine.
The stakes: who gets to change the code, and who is reduced to serving it
When CPUs win by weaponizing backward compatibility, the center of power in computing shifts away from those who design the fastest hardware or the cleverest algorithms. It moves toward whoever controls the legacy: the codebases, the contracts, the cloud platforms, the norms of “it‑just‑works.”
For human agency, that means more people in HPC and AI will find their work defined by constraints they didn’t choose. They will be asked to tune, port, and maintain within an architecture that exists to protect accumulated software capital, not to maximize what is mathematically or scientifically possible. Identity shifts from “builder of new systems” to “caretaker of systems that cannot be allowed to break.”
The deeper question is whether we accept that the path of next‑generation supercomputing is constrained by what our old CPU‑bound code can tolerate. If the answer remains yes, then the most profound decisions about AI and HPC won’t be made in research labs or GPU roadmaps. They will be made wherever someone decides that supporting one more generation of “it‑just‑works” CPUs is cheaper than asking humans to change how they think and write code.