Google ships its most expressive Gemini 3.1 text-to-speech model yet with 70+ language support

Google is rolling out a new text-to-speech model based on Gemini 3.1 Flash. The company says it delivers the most natural and expressive voice output it has shipped to date. The headline feature is audio tags: simple text commands that let developers control the style, tempo, tone, and accent of the generated speech. The model supports more than 70 languages and can handle multi-speaker dialogue. On the Artificial Analysis leaderboard, the model has an Elo rating of 1,211 and stands out for its quality-to-price ratio. It beats ElevenLabs v3 in overall quality and sits just behind Inworld 1.5 Max.

Full report: Google rolls out Gemini 3.1 Flash TTS, a text-to-speech model supporting 70+ languages and audio tags that let developers control style, tempo, tone, and accent.

Tagged: Gemini AI Google