
OpenAI launches GPT-4.1 models, with a focus on coding

OpenAI is introducing three new models in the API: GPT-4.1, GPT-4.1 mini and GPT-4.1 nano. These models perform better than GPT-4o in all areas, especially in programming and following instructions. They support up to 1 million tokens.

The launch of GPT-4.1 does not come as a surprise. Earlier this week, it was leaked that OpenAI is working on a successor to GPT-4o. Interestingly, these new models are not limited to one variant, but OpenAI has opted for a whole family of models with different capacities and cost levels.

Leaps in programming skill

For developers, the improved coding capability is a particularly important step forward. GPT-4.1 scores 54.6% on SWE-bench Verified, an improvement of 21.4 percentage points over GPT-4o. This human-validated benchmark measures how well a model can resolve real-world software issues. The combination of stronger coding and better instruction following makes these models particularly suitable for building autonomous agents.

What further distinguishes GPT-4.1 is its improved handling of long context. The model can process up to 1 million tokens and scores 72% on Video-MME, a benchmark for multimodal understanding of long context, which is 6.7 percentage points better than GPT-4o achieved.

Mini and Nano: smaller models, big impact

In addition to the flagship GPT-4.1, OpenAI also introduces GPT-4.1 mini and GPT-4.1 nano. GPT-4.1 mini matches or outperforms GPT-4o on various benchmarks while cutting latency nearly in half and costing 83% less. GPT-4.1 nano is the fastest and cheapest model, ideal for tasks such as classification or auto-completion.

These smaller variants retain an impressive context window of 1 million tokens despite their size. GPT-4.1 nano scores even higher than GPT-4o mini on various tests. This means that powerful solutions are now available even for applications that require low latency.

It is worth noting that GPT-4.1 will only be available via the API, not in ChatGPT. For users of the chat service, the improvements will be gradually implemented in the existing GPT-4o version.

For API developers who work with large files, GPT-4.1 handles code diffs more reliably across different diff formats. The output limit has also been doubled from 16,384 to 32,768 tokens, allowing larger files to be rewritten in a single response.
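As a rough sketch of what taking advantage of that raised output limit could look like, the snippet below assembles a Chat Completions request asking the model to return a unified diff. The model identifier `gpt-4.1` and the `max_tokens` parameter follow OpenAI's API conventions, but the helper function, prompt wording, and file contents are purely illustrative, not taken from the article.

```python
# Illustrative sketch: requesting a unified diff from GPT-4.1 with the
# raised 32,768-token output ceiling. Only the request payload is built
# here; sending it requires the `openai` package and an API key.

def build_diff_request(file_text: str, instruction: str) -> dict:
    """Assemble a hypothetical Chat Completions payload for a code-diff task."""
    return {
        "model": "gpt-4.1",
        "max_tokens": 32_768,  # new output limit, doubled from 16,384
        "messages": [
            {"role": "system",
             "content": "Return your changes as a unified diff only."},
            {"role": "user",
             "content": f"{instruction}\n\n```\n{file_text}\n```"},
        ],
    }

payload = build_diff_request(
    "def add(a, b):\n    return a - b\n",
    "Fix the bug in add().",
)
# The payload would then be passed to client.chat.completions.create(**payload).
```

Keeping the payload construction separate from the network call makes the large output budget and diff-only instruction easy to test without contacting the API.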