Nvidia announces the arrival of Rubin, a new generation of AI chips and the successor to the still-fresh Blackwell GPUs. Rubin is said to surpass Blackwell in both performance and energy efficiency. Will this be another resounding success for Nvidia, or will that become more difficult now that competition is intensifying?
The novelty has barely worn off the Blackwell chips. The March announcement is still fresh in everyone’s mind, and production has only just started. Nvidia, however, is not dwelling on it: the chip manufacturer is already moving ahead and announcing a new generation. Some patience is still required, though, as that generation will not launch until 2026.
Jensen Huang, CEO of Nvidia, presented the roadmap for the new Rubin chips during Computex on Monday. Regarding specifications, Huang only shared that the GPUs will support High Bandwidth Memory 4 (HBM4), with up to a maximum of eight HBM4 stacks. This memory standard is also expected to arrive in 2026. The Blackwell chips currently in production support eight HBM3e stacks.
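To put those stack counts in perspective: a GPU’s total memory capacity and bandwidth scale roughly linearly with the number of HBM stacks on the package. The sketch below is a back-of-the-envelope illustration only; the per-stack figures are assumptions based on typical published HBM3e numbers (around 24 GB and roughly 1 TB/s per stack), not values from Nvidia’s announcement.

```python
# Back-of-the-envelope: how the HBM stack count translates into capacity and bandwidth.
# Per-stack figures are assumptions (typical published HBM3e numbers), not Nvidia specs.

def hbm_totals(stacks: int, gb_per_stack: float, tbps_per_stack: float) -> tuple[float, float]:
    """Return (total capacity in GB, total bandwidth in TB/s) for one GPU package."""
    return stacks * gb_per_stack, stacks * tbps_per_stack

# Blackwell-class part: eight HBM3e stacks (assumed 24 GB and ~1 TB/s per stack).
capacity, bandwidth = hbm_totals(stacks=8, gb_per_stack=24, tbps_per_stack=1.0)
print(f"8 stacks:  ~{capacity:.0f} GB, ~{bandwidth:.1f} TB/s")

# Blackwell Ultra / Rubin Ultra-style part: twelve stacks of the same assumed memory.
capacity, bandwidth = hbm_totals(stacks=12, gb_per_stack=24, tbps_per_stack=1.0)
print(f"12 stacks: ~{capacity:.0f} GB, ~{bandwidth:.1f} TB/s")
```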
Also read: Nvidia solidifies AI lead at GTC 2024 with Blackwell GPUs
Blackwell refresh in 2025
Before that happens, a more powerful version of the Blackwell GPU will hit the market in 2025. This Blackwell Ultra refresh will support twelve HBM3e stacks. Nvidia is keeping the same release cycle for Rubin: a Rubin Ultra successor is already planned for 2027, with support for twelve HBM4 stacks.
Finally, the Rubin GPUs will also serve as the basis for a successor to the Grace Hopper superchip. That will be the Vera Rubin superchip, which combines two Rubin GPUs with a Vera processor, the successor to the Arm-based Grace CPUs.
Competition sharpens knives
Rubin and Vera are bound to be deployed in many data centres from 2026 onwards. That is a safe prediction, as Nvidia dominates the market for AI hardware in data centres. Yet that dominance is cautiously being challenged now that Nvidia’s competitors have formed an initial collaboration targeting its NVLink interconnect technology. AMD, Broadcom, Cisco, Google, HPE, Intel, Meta and Microsoft plan to develop a new industry standard later this year to combine the strengths of various servers. This appears to be a prelude to further attempts to end Nvidia’s stranglehold on AI.
AMD is mounting a further attack on Nvidia during Computex. It starts with the announcement of the MI325X, an accelerator that data centres can deploy and combine at scale. Production will start this year. The GPU supports the HBM3e memory standard, just like Nvidia’s Blackwell chips; the current generation of MI300X GPUs still uses HBM3 and therefore has 30 percent less bandwidth. The MI325X also gets a big memory expansion, to 288 GB, which is crucial for feeding large AI models to the GPU as quickly and efficiently as possible.
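Why 288 GB matters is easy to show with some rough arithmetic: a model’s weights must fit in GPU memory (or be split across GPUs), and the bytes per parameter depend on the numerical precision used. The sketch below is a simplified estimate that ignores activations, the KV cache and other runtime overhead; the model sizes are illustrative, not AMD figures.

```python
# Rough estimate of how much GPU memory a model's weights alone require,
# and whether they fit in a single 288 GB accelerator. Illustrative only:
# real deployments also need memory for activations, KV cache and framework overhead.

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}  # common inference precisions
GPU_MEMORY_GB = 288

def weights_gb(params_billion: float, precision: str) -> float:
    """Memory needed (in GB) for the weights of a model with the given parameter count."""
    return params_billion * 1e9 * BYTES_PER_PARAM[precision] / 1e9

for params in (70, 180, 400):  # illustrative model sizes in billions of parameters
    for precision in ("fp16", "fp8"):
        needed = weights_gb(params, precision)
        verdict = "fits" if needed <= GPU_MEMORY_GB else "needs multiple GPUs"
        print(f"{params}B @ {precision}: ~{needed:.0f} GB -> {verdict}")
```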
Also read: AMD introduces Instinct MI300X and MI300A for data centers
Next year, AMD CEO Lisa Su says, the company will release the new MI350 chip line, which will be based on the CDNA 4 architecture. That should yield hefty performance gains: AMD speaks of 35 times (!) better performance for AI inference. This emphasis on inference, the process of generating responses from a trained model, shows how AMD wants to address the AI market. The roadmap, explained online by Brad McCredie, corporate vice president of Data Center Accelerated Compute at AMD, is also brimming with AI: “With our updated annual cadence of products, we are relentless in our pace of innovation and provide the leadership capabilities and performance that the AI industry and our customers expect as we drive the next evolution of data centre AI training and inference.”
Still planned for 2026 is the MI400 series, which will be built on an architecture known for now simply as ‘Next’.
Faster cadence
Both Nvidia and AMD are stepping up the cadence of AI GPU releases from every two years to annually. In a field developing as rapidly as AI, shifting up a gear is necessary to avoid falling behind, and both companies are clearly motivated to do so. How this works out in practice remains to be seen, however. The question is whether the chip manufacturers can deliver a significant performance jump with every release; only then will new hardware remain financially attractive enough for buyers.