IBM introduces z17 mainframe for on-premises AI

IBM has announced the z17 mainframe, which is equipped with the Telum II processor and the Spyre Accelerator. These components allow organizations to run generative AI models and agentic AI on-premises, where data must be accessible at the lowest possible latency.

The new IBM z17 is designed to process the most critical transactions. Big Blue claims that approximately 70 percent of all financial transactions worldwide run on IBM mainframes. With the introduction of the Telum II processor and the Spyre Accelerator, IBM has made significant progress in improving AI capabilities for enterprise organizations.

The Spyre Accelerator, available in the fourth quarter of 2025, is the result of years of development by IBM Research. This 32-core accelerator is an optional PCIe card, and multiple cards can be added as needed. The accelerator builds on the original Telum chip in the z16 systems.

In early tests, a Spyre prototype processed more than three times as many images per second per watt as high-end GPUs. That is a notable gain, given the enormous energy requirements of AI workloads.

On-chip AI acceleration

The Telum II processor and the Spyre Accelerator form the heart of the z17 system. The processor contains a built-in AI accelerator core, comparable to its predecessor in the z16, but with improved performance.

“We built a complete accelerator,” says Jeff Burns, director of the IBM Research AI Hardware Center. “It’s a system-on-chip, and a PCIe card, and a compiler, and a runtime, and a device driver, and so on.” This complete stack allows data scientists to use Spyre without special modifications.

Designed for future workloads

A major challenge in designing AI chips is the timeline. Workloads change quickly, but developing chips takes years. IBM has tackled this problem by using watsonx, its AI platform, as a guide. The watsonx AI roadmap, developed years ago, predicted that by 2025, purpose-built hardware would help generative AI scale in new ways.

The Spyre Accelerator is optimized for generative and agentic AI, rather than for models that are becoming less relevant in the sector, such as classification models. That positions the technology for future developments in AI.
