Google Cloud Next 2025: Ironwood TPU and Gemini 2.5 Lead AI Push

Image Source: ChatGPT-4o
At Google Cloud Next 2025 in Las Vegas, Google announced sweeping advancements across AI chips, infrastructure, and developer tools—headlined by the launch of Ironwood, its most powerful and energy-efficient Tensor Processing Unit (TPU) to date. The announcements mark a pivotal step into what Google is calling the “age of inference”—where AI moves from reactive tools to proactive, reasoning-based agents.
CEO Sundar Pichai highlighted the company’s two-decade investment in AI, emphasizing its central role in Google’s mission to organize the world’s information and make it universally accessible and useful. This year’s updates continue that trajectory, offering businesses advanced AI infrastructure and models that are ready to deploy today.
Ironwood: Google’s TPU for the Age of Inference
With the launch of Ironwood, Google is ushering in what it calls the “age of inference”: a shift from reactive AI models that simply respond to prompts, to proactive systems that reason, retrieve, and generate insights on their own. In this new phase, AI agents don’t just deliver data for humans to interpret; they interpret it themselves, delivering actionable, context-aware answers.
Ironwood is the hardware backbone built for this future. As Google’s seventh-generation TPU, it’s purpose-built to power the computational demands of modern thinking models—from large language models (LLMs) and mixture-of-experts (MoEs) to complex recommendation systems and real-time agents.
Key breakthroughs include:
Up to 9,216 chips per pod, delivering a staggering 42.5 exaflops of compute—24x more than the world’s largest current supercomputer
Enhanced Inter-Chip Interconnect (ICI) bandwidth at 1.2 Tbps bidirectional, enabling efficient distributed inference with faster communication between chips
High Bandwidth Memory (HBM) capacity of 192 GB per chip and 7.2 TBps bandwidth—both significantly upgraded from the previous generation. This enables fast access to data, which is essential for handling today’s memory-intensive AI workloads.
A specialized SparseCore accelerator that handles the ultra-large embeddings used in ranking systems, extending Ironwood beyond traditional AI workloads into areas such as finance and scientific research
2x better performance-per-watt than the previous TPU (Trillium), driven by liquid cooling and architectural efficiency
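The headline pod number is consistent with simple arithmetic. The sketch below checks it; only the 9,216-chip pod size and 42.5-exaflop total come from the announcement, while the per-chip peak of roughly 4,614 TFLOPs (at FP8) is an assumed figure Google has quoted elsewhere for Ironwood.

```python
# Back-of-the-envelope check of the Ironwood pod compute figure.
# Assumption: ~4,614 TFLOPs per chip at FP8 precision (a figure Google
# has quoted for Ironwood); the pod size and 42.5-exaflop total are
# from the announcement itself.
CHIPS_PER_POD = 9_216
TFLOPS_PER_CHIP = 4_614  # assumed per-chip peak (FP8)

pod_tflops = CHIPS_PER_POD * TFLOPS_PER_CHIP
pod_exaflops = pod_tflops / 1e6  # 1 exaflop = 1,000,000 TFLOPs

print(f"{pod_exaflops:.1f} exaflops per pod")  # ≈ 42.5
```

Note that the “24x the world’s largest supercomputer” comparison mixes precisions: supercomputer rankings are typically quoted at FP64, while the 42.5-exaflop figure is at low precision.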
Ironwood also scales with Google’s Pathways runtime, allowing developers to harness the power of hundreds of thousands of TPUs for dense AI workloads—all integrated into the AI Hypercomputer architecture on Google Cloud.
Gemini 2.5 and Flash: Smarter, Cost-Effective AI Reasoning
Google reaffirmed its leadership in model performance with Gemini 2.5, its most advanced reasoning model, now available via Google AI Studio, Vertex AI, and the Gemini app. Gemini 2.5 has topped global benchmarks, including Chatbot Arena and the notoriously challenging “Humanity’s Last Exam.”
To support developers and businesses looking for more scalable solutions, Google also announced Gemini 2.5 Flash—a lighter-weight model optimized for low-latency, cost-sensitive inference. Flash lets users dynamically control reasoning depth, balancing performance and budget with greater flexibility. Gemini 2.5 Flash will be available soon in Google AI Studio, Vertex AI, and the Gemini app, with more details on its capabilities and performance to follow.
Cloud WAN: Opening Google’s Private Network to the World
Another major infrastructure announcement was the launch of Cloud Wide Area Network (Cloud WAN)—Google’s global private backbone, now available to enterprise customers.
Used internally to support services like Gmail, Search, and Gemini, the network spans over 2 million miles of fiber across 200+ countries, offering:
40% faster application performance
Up to 40% lower total cost of ownership
Near-zero latency and planet-scale reliability
Companies like Nestlé and Citadel Securities are already leveraging Cloud WAN for high-performance workloads. Cloud WAN will become generally available later this month.
AI for Products, Platforms, and Agents
Google emphasized that AI isn’t just powering infrastructure—it’s embedded across the company’s products and platforms:
All 15 half-billion-user products (including 7 with over 2 billion users) now use Gemini models
Lyria Joins Vertex AI - With the integration of Lyria, Google Cloud becomes the only platform offering generative AI capabilities across video, image, speech, and music—enabling creative and enterprise teams to build richer, multimodal experiences with a unified toolset.
NotebookLM and Veo 2 offer advanced multimodal and video generation capabilities, now used by 100,000+ businesses and major creative studios. AI powers their ability to understand long context, generate content, and present information in dynamic, highly visual formats.
Google Workspace now delivers over 2 billion AI-powered suggestions each month—helping users write, summarize, organize, and collaborate across Docs, Sheets, Meet, and more.
On the developer side, Google announced:
Agent Development Kit (ADK) – An open-source framework for building custom AI agents
Agent2Agent (A2A) – A new interoperability protocol so agents can collaborate across frameworks and vendors
Agentspace and AI Agent Marketplace – Agentspace provides tools and infrastructure for customers to create, manage, and deploy AI agents across Google Cloud, while the AI Agent Marketplace is a curated platform where businesses can browse, buy, and integrate prebuilt AI agents from Google partners. Together, they streamline the discovery, development, and adoption of AI agents at scale.
Google Unified Security - Google Unified Security brings together Google’s best-in-class solutions for threat intelligence, cloud and enterprise security, and secure browsing into a single, AI-powered platform—designed to simplify operations and strengthen defenses across the enterprise.
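To make the A2A idea concrete: the protocol lets agents advertise themselves through machine-readable “agent cards” that other agents can discover and call. The sketch below builds one as plain JSON. The field names loosely follow the published A2A draft, but the specific agent, endpoint URL, and skill shown here are hypothetical, and the exact schema should be checked against the spec rather than taken from this example.

```python
import json

# Illustrative sketch of an A2A "agent card": the JSON document an agent
# publishes so that agents built on other frameworks can discover its
# capabilities. Field names loosely follow the A2A draft; the agent name,
# URL, and skill are invented for illustration.
agent_card = {
    "name": "invoice-summarizer",  # hypothetical agent
    "description": "Summarizes uploaded invoices into structured line items.",
    "url": "https://agents.example.com/invoice-summarizer",  # hypothetical endpoint
    "version": "1.0.0",
    "capabilities": {"streaming": True},
    "skills": [
        {
            "id": "summarize",
            "name": "Summarize invoice",
            "description": "Returns a structured summary of an invoice document.",
        }
    ],
}

card_json = json.dumps(agent_card, indent=2)
print(card_json)
```

The point of the card is interoperability: a consuming agent only needs to parse this document to learn what the publisher can do and where to reach it, regardless of which vendor or framework built either side.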
Quantum and Scientific AI
Google also highlighted progress in quantum computing. Its Willow chip recently achieved a key milestone in quantum error correction, solving a decades-long challenge. This paves the way for scalable quantum systems, which Google sees as foundational for future breakthroughs in science, discovery, and AI.
Projects like AlphaFold (protein folding) and WeatherNext (advanced forecasting) continue to push the boundaries of what Google’s AI stack can achieve across disciplines.
What This Means
Google’s Cloud Next 2025 announcements represent a comprehensive realignment of the AI stack—from hardware and infrastructure to models and deployment tools. The unveiling of Ironwood signals a new era of inferential computing, where models don't just respond to queries but reason, adapt, and generate insights autonomously.
For businesses, this means faster, more cost-effective access to powerful AI—whether training large models, deploying lightweight agents, or transforming productivity tools. With Gemini 2.5 and Flash, Cloud WAN, and TPUs like Ironwood, Google is positioning itself not only as an AI model leader, but as a full-stack provider for enterprise-grade intelligence.
As AI becomes central to how businesses operate, Google Cloud’s integrated ecosystem—models, chips, networks, and agent platforms—offers a one-stop environment to build, scale, and innovate at the edge of what’s possible.
Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Special thanks to ChatGPT for assistance with research and editorial support in crafting this article.