U.S. tech giants are facing a reckoning from the East. Even as Nvidia pledged today to invest a staggering $100 billion in the data centers of its own customer OpenAI (a move that raised eyebrows across the tech and business worlds), the Qwen team of AI researchers at Chinese tech giant Alibaba debuted what may be its most impressive model yet: Qwen3-Omni, an open-source large language model (LLM) that the company bills as the first "natively end-to-end omni-modal AI unifying text, image, audio & video in one model."

To be clear: Qwen3-Omni can accept and analyze text, image, audio, and video inputs from a user, but it outputs only text and audio. That is still a very impressive feat.

OpenAI's GPT-4o started the trend of "omni" models when it debuted back in 2024, but that model unified only text, image, and audio. Google's Gemini 2.5 Pro, released in March 2025, can also analyze video, but, like GPT-4o, it is proprietary ("closed source"), meaning you have to pay to use it. Qwen3-Omni, by contrast, can be downloaded, modified, and deployed for free under an enterprise-friendly Apache 2.0 license, even for commercial applications.