DeepSeek introduces self-learning AI models

DeepSeek is collaborating with Tsinghua University to cut the amount of training its artificial intelligence (AI) models need, with the aim of lowering operational costs.

According to The Edge, the Chinese start-up that shook up the market in January with its cheap reasoning model is now working with researchers from the institution in Beijing on a paper that describes a new approach to reinforcement learning to make artificial intelligence models more efficient.

Reward for accurate answers

The researchers wrote that the new method is intended to make AI models more in line with human preferences by rewarding more accurate and comprehensible answers. Reinforcement learning has proven effective in accelerating AI tasks within limited application areas.

However, expanding it to more general applications appears to be a challenge in practice. That is the problem that the DeepSeek team is trying to solve with what they call self-principled critique tuning. According to the paper, this strategy performed better than existing methods and models on various benchmarks and led to better performance with less computing power.

DeepSeek calls these new models DeepSeek-GRM, which stands for generalist reward modeling. DeepSeek says it will make the models available on an open source basis. Other AI developers, including the Chinese technology company Alibaba Group Holding and the San Francisco-based OpenAI, are also focusing on this new frontier of reasoning ability and self-improvement of models while performing tasks in real time.

Mixture of Experts architecture

Last weekend, Meta Platforms released its newest family of AI models, Llama 4. The company noted that these are its first models to use a mixture of experts (MoE) architecture. DeepSeek’s models also rely heavily on MoE to use resources more efficiently. Meta compared its new release with the models of the Hangzhou start-up. DeepSeek has not yet indicated when it will release its next flagship model.