
Google DeepMind recently unveiled “Gemma 3,” an advanced AI model designed to operate efficiently on a single GPU or TPU. Notably, it has outperformed models such as Llama-405B, DeepSeek-V3, and o3-mini in preliminary human preference evaluations on the LMArena leaderboard. Google frames Gemma 3 as its most capable model that can run on a single accelerator, aimed at developers who want high-performance AI applications without large multi-GPU clusters.
Gemma 3 supports over 140 languages and features an extended context window of up to 128K tokens. It also boasts cutting-edge text and visual reasoning capabilities and is available in four parameter configurations—1 billion, 4 billion, 12 billion, and 27 billion—allowing developers to select the model size that best suits their hardware and performance requirements.
The model is compatible with a wide array of development tools, including Hugging Face Transformers, Ollama, JAX, Keras, and PyTorch, and it has been optimized for NVIDIA GPUs, Google Cloud TPUs, and AMD GPUs. Developers can try Gemma 3 in Google AI Studio or download the model weights from Kaggle and Hugging Face.
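For the Hugging Face route, a minimal sketch of loading an instruction-tuned Gemma 3 checkpoint with the Transformers pipeline API might look like the following. The model ID `google/gemma-3-1b-it` and the need for a recent Transformers release are assumptions based on Google's published checkpoints; the larger variants follow the same pattern with their respective IDs.

```python
# Minimal sketch: text generation with a Gemma 3 checkpoint via Hugging Face
# Transformers. Assumes a recent transformers release with Gemma 3 support
# and that the "google/gemma-3-1b-it" model ID is accessible to your account.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",   # assumed ID for the 1B instruction-tuned variant
    torch_dtype=torch.bfloat16,     # half precision helps fit on a single GPU
    device_map="auto",              # place weights on the available accelerator
)

# Instruction-tuned Gemma variants expect a chat-style message list.
messages = [
    {"role": "user", "content": "Summarize Gemma 3 in one sentence."}
]
output = generator(messages, max_new_tokens=64)
print(output[0]["generated_text"][-1]["content"])
```

The 4B, 12B, and 27B checkpoints also accept images, so multimodal use typically goes through the corresponding processor and multimodal model classes rather than a plain text-generation pipeline.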
Beyond Gemma 3, Google DeepMind has also introduced ShieldGemma 2, a 4-billion-parameter image safety screening tool capable of detecting and labeling potentially harmful, explicit, or violent content. This enables developers to create more responsible and secure AI-powered applications.
Additionally, Google has launched the “Gemma 3 Academic Program,” offering $10,000 worth of Google Cloud credits per recipient to support academic researchers in conducting studies based on Gemma 3.
As of now, the Gemma ecosystem has generated over 60,000 derivative AI models, while the Gemma series has surpassed 100 million total downloads, underscoring its widespread adoption and influence in the AI community.