Spiced up AI: Google introduces Gemini GenAI model for highly complex tasks
New Delhi, Dec 6: Google on Wednesday spiced up the generative AI race by introducing Gemini, its most capable and general model yet, which delivers state-of-the-art performance across many leading benchmarks and comes in three sizes.
The first version, Gemini 1.0, is optimised for different sizes: Ultra, Pro and Nano.
Google’s AI chatbot Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding and more.
It will be available in English in more than 170 countries and territories, and “we plan to expand to different modalities and support new languages and locations in the near future,” said Google.
The company also brought Gemini to the Pixel 8 Pro, where it powers new features like Summarise in the Recorder app and is rolling out in Smart Reply in Gboard, starting with WhatsApp, with more messaging apps coming next year.
In the coming months, Gemini will be available in more of Google’s products and services like Search, Ads, Chrome and Duet AI.
“These are the first models of the Gemini era and the first realisation of the vision we had when we formed Google DeepMind earlier this year,” said Alphabet and Google CEO Sundar Pichai.
Gemini is the result of large-scale collaborative efforts by teams across Google, including colleagues at Google Research.
“It was built from the ground up to be multimodal, which means it can generalise and seamlessly understand, operate across and combine different types of information including text, code, audio, image and video,” said Demis Hassabis, CEO and Co-Founder of Google DeepMind.
While Gemini Ultra is the largest and most capable model for highly complex tasks, Gemini Pro is the model for scaling across a wide range of tasks and Gemini Nano is for on-device tasks.
“With a score of 90 per cent, Gemini Ultra is the first model to outperform human experts on MMLU (massive multitask language understanding), which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities,” said Google.
From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra’s performance exceeds current state-of-the-art results on 30 of the 32 widely used academic benchmarks in large language model (LLM) research and development.
According to Google, Gemini 1.0’s sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information.
“Our first version of Gemini can understand, explain and generate high-quality code in the world’s most popular programming languages, like Python, Java, C++, and Go,” said the company.
Gemini can also be used as the engine for more advanced coding systems.
The company trained Gemini 1.0 at scale on its AI-optimised infrastructure using Google’s in-house designed Tensor Processing Units (TPUs) v4 and v5e.
“Today, we’re announcing the most powerful, efficient and scalable TPU system to date, Cloud TPU v5p, designed for training cutting-edge AI models,” said Google.