Nvidia has unveiled a new strategy for lowering the energy consumption of data centres that process enormous volumes of data or train AI models: liquid-cooled graphics cards. The company revealed at Computex that it will release a liquid-cooled version of its A100 compute card, which will consume 30% less power than the air-cooled version. Nvidia also promises that this isn’t a one-off and that more liquid-cooled server cards are on the way, and it hints at bringing the technology to other applications, such as in-car systems that need to stay cool in tight spaces. Of course, Tesla’s recent chip recall demonstrates how difficult that can be, even with liquid cooling.
According to Nvidia, lowering the energy required to run complicated computations could have a significant impact: the company claims that data centres consume more than 1% of global electricity, with cooling accounting for 40% of that. Trimming card power by nearly a third would be significant; however, it is worth noting that graphics cards are only one part of the equation, since CPUs, storage, and networking equipment also consume power and require cooling. Nvidia says that GPU-accelerated systems with liquid cooling would be significantly more efficient than CPU-only servers for AI and other high-performance workloads.
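To put those percentages in rough context, here is a back-of-envelope sketch. The facility size and the GPUs’ share of the IT load are assumed, hypothetical values; only the 40% cooling figure and the 30% reduction come from the claims above, so treat the output as illustrative rather than as Nvidia’s numbers.

```python
# Illustrative back-of-envelope estimate of the savings claim.
# Facility size and GPU share are hypothetical assumptions; the 40% cooling
# share and 30% power reduction are the figures cited in the article.

facility_power_kw = 10_000      # assumed total facility draw (10 MW, hypothetical)
cooling_share = 0.40            # article: cooling is ~40% of data-centre power
gpu_share_of_it_load = 0.30     # assumed share of the remaining IT load drawn by GPUs
gpu_power_reduction = 0.30      # article: liquid-cooled A100 uses ~30% less power

it_power_kw = facility_power_kw * (1 - cooling_share)
gpu_power_kw = it_power_kw * gpu_share_of_it_load
savings_kw = gpu_power_kw * gpu_power_reduction

print(f"GPU power: {gpu_power_kw:.0f} kW of {facility_power_kw} kW total")
print(f"Estimated saving: {savings_kw:.0f} kW "
      f"({savings_kw / facility_power_kw:.1%} of the facility)")
```

Under those assumptions the saving works out to a few percent of the whole facility’s draw, which is meaningful at data-centre scale but, as noted above, still only addresses the GPU slice of the load.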
According to Asetek, a leading maker of water-cooling systems, there’s a reason liquid cooling is popular in high-performance use cases ranging from supercomputers to specialised gaming PCs and even a few phones: liquids absorb heat better than air. And once you have warm liquid, it’s relatively simple to move it somewhere else to cool down, as opposed to trying to cool the air in an entire building or push more airflow over the exact components on a card that are giving off all the heat.
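Asetek’s point is easy to quantify with standard room-temperature textbook values (these figures are general physics, not anything from Nvidia or Asetek): per unit of volume, water can soak up thousands of times more heat than air for the same temperature rise.

```python
# Rough comparison of how much heat a given volume of coolant absorbs per degree
# of temperature rise, using standard room-temperature textbook values.

water_specific_heat = 4186   # J/(kg*K)
water_density = 997          # kg/m^3
air_specific_heat = 1005     # J/(kg*K), at constant pressure
air_density = 1.2            # kg/m^3

water_volumetric = water_specific_heat * water_density  # J/(m^3*K)
air_volumetric = air_specific_heat * air_density        # J/(m^3*K)

print(f"Water: {water_volumetric / 1e6:.2f} MJ per cubic metre per K")
print(f"Air:   {air_volumetric / 1e3:.2f} kJ per cubic metre per K")
print(f"Ratio: roughly {water_volumetric / air_volumetric:,.0f}x")
```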
Aside from being more energy-efficient, liquid-cooled cards offer another advantage over their air-cooled counterparts: they take up substantially less room, allowing you to fit more of them into the same amount of space.
Nvidia’s drive to reduce energy consumption through liquid cooling comes at a time when many businesses are concerned about how much energy their servers consume. While data centres are far from the main source of carbon emissions and pollution for big tech, they are an important piece of the issue, and critics have argued that offsetting energy use with credits isn’t as effective as cutting consumption outright. To use less electricity and water, companies such as Microsoft have experimented with fully submerging servers in liquid and even putting entire data centres in the ocean.
Of course, those options are fairly exotic. While the type of liquid cooling Nvidia is offering isn’t necessarily the norm for data centres, it’s not as far-fetched as putting your servers in the ocean (though Microsoft’s efforts there have been surprisingly successful so far). Nvidia specifically markets its liquid-cooled GPUs as being for “mainstream” servers rather than as cutting-edge solutions.
This raises the question of whether Nvidia will try to make liquid cooling even more mainstream by incorporating it into the reference designs for its gaming-focused GPUs. The company makes no mention of such plans, saying only that it will “enable liquid cooling in our high-performance data centre GPUs” in the “foreseeable future.”
However, server technology has a way of trickling down to home PCs, and gaming cards with an all-in-one liquid cooler aren’t completely unheard of: AMD has shipped a few reference designs that incorporated a liquid-cooling loop, and third parties have previously produced liquid-cooled Nvidia cards. As Nvidia’s cards draw more and more power (a stock 3090 Ti can pull up to 450 watts), I wouldn’t be surprised if Nvidia announced an RTX 5000-series card with a liquid cooler.
According to Nvidia, firms such as ASRock, Asus, and Supermicro will incorporate its liquid-cooled cards into their servers “later this year,” while slot-in PCIe A100 cards will be available in Q3 of this year. A liquid-cooled PCIe variant of the recently announced H100 card (the next-generation version of the A100) is scheduled for “early 2023.”