Google Cloud is developing its most powerful supercomputer yet

Google also claims that the A3 achieves a 30x performance boost over the A2 on inference workloads, which is the real work that generative AI performs. In addition to the eight H100s with 3.6 TB/s of bisection bandwidth between them, the A3's other standout specs include 4th Gen Intel Xeon Scalable processors and 2 TB of host memory via 4800 MHz DDR5 DIMMs.

The A3 can be deployed on Google Kubernetes Engine (GKE) and Compute Engine, where customers get support for autoscaling and workload orchestration, as well as automatic upgrades. Google's B2B approach to AI is evident here: it is offering powerful AI infrastructure to business customers rather than unleashing an AI for anyone to play around with. Nonetheless, at Google I/O it also announced PaLM 2, the successor to PaLM, which is supposedly more powerful than rival LLMs.