Google has unveiled Trillium, its latest energy-efficient Cloud TPU, designed to drive next-generation AI models. Presented at the I/O 2024 event, Trillium is the sixth-generation Google Cloud TPU, tailored specifically for AI workloads, including the latest generative AI models such as Gemini 1.5 Flash, Imagen 3, and Gemma 2.0, and it already powers Google's advanced Gemini 1.5 Pro model.

Jeff Dean, Chief Scientist at Google DeepMind and Google Research, highlighted Trillium's significance, stating that it substantially boosts performance and efficiency for both training and inference, particularly for large-scale Gemini models. Compared to its predecessor, Trillium delivers a 4.7x increase in peak compute performance per chip, along with greater memory capacity and interconnect bandwidth.

Trillium also features a third-generation SparseCore accelerator, which speeds up the processing of the large embeddings used in advanced ranking and recommendation workloads. Google claims that Trillium trains AI models faster, reduces latency, and lowers costs. It is also touted as Google's most eco-friendly TPU yet, with over 67% greater energy efficiency than its predecessor.

A single Trillium pod scales up to 256 TPUs, and multislice technology allows multiple pods to be combined into supercomputers capable of processing massive amounts of data. Trillium represents a significant milestone in AI hardware evolution, supporting advanced AI workloads and collaborations with partners such as Hugging Face for open-source model training and serving.
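The per-chip and pod-level figures can be combined in a quick back-of-envelope calculation. This is an illustrative sketch, not a benchmark: the 4.7x per-chip speedup and the 256-chip pod size come from the article, while the assumption of perfect linear scaling across chips and slices is a simplification that real workloads rarely achieve.

```python
# Back-of-envelope scaling math using figures quoted in the article.
# Assumption (not from the article): compute scales linearly with chip
# count, ignoring communication and memory-bandwidth overheads.

PER_CHIP_SPEEDUP = 4.7  # peak compute per chip vs. the previous TPU generation
CHIPS_PER_POD = 256     # maximum TPUs in a single Trillium pod

def pod_relative_compute(chips: int = CHIPS_PER_POD,
                         per_chip_speedup: float = PER_CHIP_SPEEDUP) -> float:
    """Peak compute of a full pod relative to one previous-generation chip,
    under the naive linear-scaling assumption."""
    return chips * per_chip_speedup

def multislice_relative_compute(num_slices: int) -> float:
    """Same figure for num_slices pods joined via multislice,
    again assuming ideal linear scaling across slices."""
    return num_slices * pod_relative_compute()

print(pod_relative_compute())          # one full pod vs. one older chip
print(multislice_relative_compute(4))  # four pods combined via multislice
```

Under these idealized assumptions, a single pod would offer roughly 1,200 times the peak compute of one previous-generation chip; in practice, scaling efficiency depends heavily on the workload and interconnect.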