Next-gen multimodal AI for real-time agentic experiences with 1M-token context
Gemini 2.0 is Google’s flagship AI model designed for the "agentic era," enabling AI agents to perform multi-step tasks autonomously under human supervision. It processes text, audio, images, and video natively, supports 1M-token context windows (equivalent to ~700,000 words), and introduces multimodal outputs (text, images, audio) and native tool use (e.g., Google Search, code execution). The model outperforms predecessors like Gemini 1.5 Pro in coding (92.9% on Natural2Code) and math (89.7% on MATH benchmarks) while being twice as fast