In the past, when NVIDIA founder and CEO Jensen Huang waxed lyrical about artificial intelligence, it generally came across as marketing hyperbole: the lofty rhetoric we’ve come to expect from an executive with an endless supply of leather jackets. This year, though, NVIDIA’s AI push finally appears to be going somewhere, following the buzz around OpenAI’s ChatGPT, Microsoft’s revamped Bing, and a plethora of other competitors.
The company has used its GTC (GPU Technology Conference) as a platform to pitch its hardware to the AI industry, and this year’s show is practically a celebration of how well-positioned NVIDIA is to seize the moment.
“We are at the iPhone moment for AI,” Huang said during his GTC keynote this morning. He was keen to point out NVIDIA’s role in the rise of this AI wave: in 2016, he personally delivered a DGX AI supercomputer to OpenAI, hardware that later served as the foundation for ChatGPT. DGX systems have evolved since then, but they remain out of reach for many businesses (the DGX A100 sold for $200,000 in 2020, and that was half the cost of its predecessor). What about everyone else?
That’s where NVIDIA’s brand-new DGX Cloud comes in: an (obviously) online way to tap the power of its AI supercomputers. Starting at $36,999 a month for a single node, it’s designed to be a more flexible way for businesses to scale up their AI workloads. And because everything is managed through NVIDIA’s Base Command software, DGX Cloud can also work alongside on-premises DGX systems.
According to NVIDIA, every DGX Cloud instance is powered by eight of its H100 or A100 GPUs, each with 80GB of VRAM, for a total of 640GB of memory per node. As you’d expect, the systems are linked by a low-latency fabric and backed by high-performance storage. With that much capability on tap, the cloud option may tempt even existing DGX customers: why spend another $200,000 on a box when a monthly fee gets you more? Oracle Cloud Infrastructure will host DGX Cloud initially, but NVIDIA says it will expand to Microsoft Azure, Google Cloud, and other providers “soon.”
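For a sense of what that node layout looks like from software, here’s a minimal sketch, assuming a machine with CUDA-capable GPUs and PyTorch installed, that enumerates the GPUs and totals their memory (the eight-GPU, 640GB figure is NVIDIA’s description, not something we’ve verified):

```python
import torch

# Enumerate the GPUs visible on this node and sum their memory.
# On a DGX Cloud node as NVIDIA describes it, this should report
# eight devices at roughly 80GB each, or about 640GB in total.
total_bytes = 0
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_bytes += props.total_memory
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GB")

print(f"Total GPU memory: {total_bytes / 1024**3:.0f} GB")
```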
So what do you do with all of that AI power? NVIDIA also unveiled AI Foundations, an easier way for businesses to build their own large language models (like ChatGPT) and generative AI. Big names including Adobe, Getty Images, and Shutterstock are already using it to develop their own LLMs. Two services in the family, NeMo for language and NVIDIA Picasso for images, video, and 3D, tie directly into DGX Cloud.
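NVIDIA hasn’t detailed the service’s API in this announcement, but as a purely hypothetical sketch, calling a hosted text-generation service of this kind might look something like the following. The endpoint URL, auth header, and request fields below are all made up for illustration:

```python
import requests

# Hypothetical endpoint and payload; the actual NeMo service API
# may differ entirely. This only illustrates the general shape of
# a request to a hosted, custom-tuned LLM.
API_URL = "https://api.example.com/v1/completions"  # placeholder URL
API_KEY = "your-api-key-here"                       # placeholder key

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "custom-llm",  # a model tuned on your company's data
        "prompt": "Summarize our Q1 earnings call:",
        "max_tokens": 256,
        "temperature": 0.7,
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
```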
Alongside DGX Cloud, NVIDIA showed off four new inference platforms. There’s NVIDIA L4, which the company claims offers “120x more AI-powered video performance than CPUs, combined with 99% better energy efficiency,” and can handle AI video generation, streaming, encoding and decoding, and more. There’s also NVIDIA L40 for 2D and 3D image generation, plus NVIDIA H100 NVL, an LLM-focused platform with 94GB of memory and an accelerated Transformer Engine. (NVIDIA says it delivers 12 times faster GPT-3 inference than the A100.)
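To see why memory capacity is the headline number for LLM inference, here’s some back-of-the-envelope arithmetic (our own illustration, not NVIDIA’s math): just holding a model’s weights takes parameters times bytes per parameter, before counting activations or the KV cache.

```python
# Rough memory needed just to store model weights at various precisions.
# Back-of-the-envelope only: real deployments also need room for
# activations, the KV cache, and framework overhead.
params = 175e9  # a GPT-3-scale model: 175 billion parameters

for name, bytes_per_param in [("FP16", 2), ("FP8", 1)]:
    weights_gb = params * bytes_per_param / 1e9
    cards = weights_gb / 94  # H100 NVL carries 94GB per GPU
    print(f"{name}: ~{weights_gb:.0f} GB of weights, "
          f"~{cards:.1f}x 94GB GPUs minimum")
```

Even at reduced precision, a GPT-3-scale model doesn’t fit on one 80GB GPU, which is where those extra gigabytes per card start to matter.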
The last inference platform is NVIDIA Grace Hopper for Recommendation Models, which does exactly what it says on the tin. Beyond recommendations, NVIDIA says it can also power vector databases and graph neural networks.
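Recommendation and vector-database workloads boil down to comparing embedding vectors at scale. Here’s a tiny, generic NumPy sketch of that core operation (our own toy example, unrelated to Grace Hopper’s actual software stack):

```python
import numpy as np

# Toy vector search: find the items whose embeddings are most
# similar to a user embedding, using cosine similarity.
rng = np.random.default_rng(0)
item_embeddings = rng.normal(size=(10_000, 128))  # 10k items, 128 dims
user_embedding = rng.normal(size=128)

# Normalize so dot products equal cosine similarity.
item_norm = item_embeddings / np.linalg.norm(item_embeddings, axis=1, keepdims=True)
user_norm = user_embedding / np.linalg.norm(user_embedding)

scores = item_norm @ user_norm
top5 = np.argsort(scores)[-5:][::-1]
print("Top recommendations:", top5, scores[top5])
```

At production scale this same dot-product-and-rank step runs over billions of items, which is the class of workload Grace Hopper is pitched at.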
If you’re curious to see L4 in action, it’s available in preview on Google Cloud’s G2 machines starting today. In a separate announcement, Google and NVIDIA noted that the generative AI video tool Descript and the art app WOMBO are already using L4 via Google Cloud.