Nvidia Corp. today launched a powerful reasoning artificial intelligence model that unifies text, vision and speech and can act as the “brains” of faster, smarter agentic AI applications. Dubbed Nemotron 3 Nano Omni and weighing in at about 30 billion parameters, the new state-of-the-art model uses a mixture-of-experts architecture to deliver extremely low latency along with high flexibility and control.

Nvidia combined vision and audio encoders with its 30B-A3B hybrid MoE architecture, eliminating the need for separate perception modules and letting a single model handle text, vision and speech together. The company said this design improves efficiency at scale and provides up to nine times higher throughput than other open omni models on the market.

“To build useful agents, you can’t wait seconds for a model to interpret a screen,” said Gautier Cloix, chief executive of H Company. “By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings, something that wasn’t practical before.”