Gemini 1.5 Pro, Google’s most capable generative AI model, is now available in public preview on Vertex AI, Google’s enterprise-focused AI development platform. The company announced the news during its annual Cloud Next conference, which is taking place in Las Vegas this week. Gemini 1.5 Pro launched in February, joining Google’s Gemini family of generative AI models. Undoubtedly its headlining feature is the amount of context that it can process: between 128,000 tokens to up to 1 million tokens, where “tokens” refers to subdivided bits of raw data (like the syllables “fan,” “tas” and “tic” in the word “fantastic”). One million tokens is equivalent to around 700,000 words or around 30,000 lines of code. It’s about four times the amount of data that Anthropic’s flagship model, Claude 3, can take as input and about eight times as high as OpenAI’s GPT-4 Turbo max context. A model’s context, or context window, refers to the initial set of data (e.g. text) the model considers before generating output (e.g. additional text). A simple question — “Who won the 2020 U.S. presidential election?” — can serve as context, as can a movie script, email, essay or e-book. Models with small context windows tend to “forget” the content of even very recent conversations, leading them to veer off topic. This isn’t necessarily so with models with large contexts. And, as an added upside, large-context models can better grasp the narrative flow of data they take in, generate contextually richer responses and reduce the need for fine-tuning and factual grounding — hypothetically, at least. So what specifically can one do with a 1 million-token context window? Lots of things, Google promises, like analyzing a code library, “reasoning across” lengthy documents and holding long conversations with a chatbot. Because Gemini 1.5 Pro is multilingual — and multimodal in the sense that it’s able to understand images and videos and, as of Tuesday, audio streams in addition to text — the model can also analyze and compare content in media like TV shows, movies, radio broadcasts, conference call recordings and more across different languages. One million tokens translates to about an hour of video or around 11 hours of audio.
About OODA Analyst
OODA is comprised of a unique team of international experts capable of providing advanced intelligence and analysis, strategy and planning support, risk and threat management, training, decision support, crisis response, and security services to global corporations and governments.