Chinese artificial intelligence (AI) start-up DeepSeek has introduced a novel approach to improving the reasoning capabilities of large language models (LLMs), as the public awaits the release of the company’s next-generation model. In collaboration with researchers from Tsinghua University, DeepSeek developed a technique that combines two methods, generative reward modelling (GRM) and self-principled critique tuning, according to a paper published on Friday.

The dual approach aims to enable LLMs to deliver better and faster answers to general queries. The resulting DeepSeek-GRM models outperformed existing methods and “achieved competitive performance” with strong public reward models, the researchers wrote.

Reward modelling is a process that steers an LLM’s outputs towards human preferences. DeepSeek intends to make the GRM models open source, the researchers said, though they gave no timeline.

The academic paper, published on the online scientific paper repository arXiv, comes amid speculation about the start-up’s next move following the global attention garnered by the firm’s V3 foundation model and R1 reasoning model.
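The core idea behind generative reward modelling, as the term suggests, is that the judge model writes out principles and a critique in text before assigning a numeric score, rather than emitting a bare scalar. The Python sketch below illustrates that general shape only: `call_llm` is a hypothetical stand-in for a real inference backend, and the prompt wording and score format are assumptions for illustration, not the paper’s actual templates.

```python
# Illustrative sketch of generative reward modelling (GRM): the reward
# model generates principles and a critique as text, then a numeric
# score is parsed out of that text and used to rank candidate responses.

import re
from dataclasses import dataclass

@dataclass
class Judgement:
    critique: str  # the model's written reasoning about the response
    score: int     # numeric score parsed from the critique text

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in a real inference API here."""
    # Stubbed output so the example runs end to end.
    return ("Principle: answers should be factual and concise.\n"
            "Critique: the response is accurate and to the point.\n"
            "Score: 8")

def grade(query: str, response: str) -> Judgement:
    # Ask the model to state principles, critique, then score.
    prompt = (
        "First state the principles a good answer must satisfy, "
        "then critique the response against them, "
        "then end with a line 'Score: <1-10>'.\n"
        f"Query: {query}\nResponse: {response}"
    )
    text = call_llm(prompt)
    match = re.search(r"Score:\s*(\d+)", text)
    score = int(match.group(1)) if match else 0
    return Judgement(critique=text, score=score)

def pick_best(query: str, responses: list[str]) -> str:
    # Rank candidate responses by their generated scores.
    return max(responses, key=lambda r: grade(query, r).score)

if __name__ == "__main__":
    print(pick_best("What is the capital of France?", ["Paris", "Lyon"]))
```

Because the score emerges from generated text rather than a fixed scoring head, this style of reward model can, in principle, explain its judgements and be sampled multiple times at inference to refine them.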