Google Cloud is developing its most powerful supercomputer yet

Google also claims that the A3 achieves a 30x performance boost over the A2 on inference workloads, which is the real work that generative AI performs. In addition to the eight H100s with 3.6 TB/s of bisection bandwidth between them, the A3's other standout specs include 4th Gen Intel Xeon Scalable processors and 2 TB of host memory via 4800 MHz DDR5 DIMMs.

The A3 can be deployed on Google Kubernetes Engine (GKE) and Compute Engine, where customers get support for autoscaling and workload orchestration, as well as automatic upgrades. Google's B2B approach to AI is evident here: it is offering powerful AI infrastructure to business customers rather than unleashing an AI for anyone to play around with. Nonetheless, at Google I/O it also announced PaLM 2, the successor to PaLM, which is supposedly more powerful than rival LLMs.