
NVIDIA unveils H200 to supercharge AI and HPC workloads


In a move set to redefine the landscape of artificial intelligence (AI) computing, NVIDIA has announced the HGX H200, built around its new flagship H200 GPU. The H200 represents a significant advance over its predecessor, the H100, with 1.4 times the memory bandwidth and 1.8 times the memory capacity. These enhancements position the H200 as a game-changer for memory-intensive generative AI workloads.

While the H200 shares many similarities with the H100, the key differentiator lies in its memory architecture. The H200 is the first GPU to adopt the new HBM3e memory specification, raising its memory bandwidth to 4.8 terabytes per second, up from 3.35 terabytes per second in the H100. Total memory capacity has also grown to 141GB, a substantial upgrade over the 80GB of its predecessor.
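
As a quick sanity check on those figures, here is a minimal Python sketch (not NVIDIA code; the spec values are simply the ones quoted above) that computes the generational ratios:

# Spec values as quoted in this article (H100 vs. H200).
h100 = {"memory_gb": 80, "bandwidth_tb_s": 3.35}
h200 = {"memory_gb": 141, "bandwidth_tb_s": 4.8}

bandwidth_gain = h200["bandwidth_tb_s"] / h100["bandwidth_tb_s"]
capacity_gain = h200["memory_gb"] / h100["memory_gb"]

print(f"Memory bandwidth: {bandwidth_gain:.2f}x")  # ~1.43x, i.e. the ~1.4x cited
print(f"Memory capacity:  {capacity_gain:.2f}x")   # ~1.76x, i.e. the ~1.8x cited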

NVIDIA plans to release the first batch of H200 chips in the second quarter of 2024. The company is actively collaborating with “global system manufacturers and cloud service providers” to ensure widespread availability of the innovative GPU.

The H200, built on the NVIDIA Hopper architecture, is expected to deliver substantial performance improvements, including nearly double the inference speed on Llama 2, a 70-billion-parameter language model, compared with the H100. Further gains are anticipated from future software updates.

The NVIDIA H200 is available in four- and eight-way configurations on NVIDIA HGX H200 server boards. It is also part of the NVIDIA GH200 Grace Hopper™ Superchip with HBM3e, offering deployment flexibility across data center environments, including on-premises, cloud, hybrid cloud, and edge.

NVIDIA’s global ecosystem of partner server makers, including ASRock Rack, ASUS, Dell Technologies, Eviden, GIGABYTE, Hewlett Packard Enterprise, Ingrasys, Lenovo, QCT, Supermicro, Wistron, and Wiwynn, can seamlessly upgrade existing systems with the H200.

Leading cloud service providers, including Amazon Web Services, Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, CoreWeave, Lambda, and Vultr, are set to be among the first to deploy H200-based instances starting next year.

The HGX H200, powered by NVIDIA NVLink™ and NVSwitch™ high-speed interconnects, stands out for its exceptional performance across various application workloads, notably excelling in LLM training and inference for models beyond 175 billion parameters. An eight-way HGX H200 delivers over 32 petaflops of FP8 deep learning compute and 1.1TB of aggregate high-bandwidth memory, making it the ideal choice for demanding generative AI and high-performance computing (HPC) applications.
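
Those aggregate figures follow directly from the per-GPU numbers. The sketch below is an illustrative back-of-the-envelope check; the ~4 petaflops of FP8 per GPU is an assumption for the sake of the arithmetic, not a figure from this article:

# Back-of-the-envelope aggregation for an eight-way HGX H200 board.
gpus = 8
hbm3e_per_gpu_gb = 141        # per-GPU memory capacity quoted above
fp8_per_gpu_pflops = 4.0      # assumed ~4 PFLOPS FP8 per GPU (with sparsity)

total_hbm_tb = gpus * hbm3e_per_gpu_gb / 1000   # ~1.1 TB aggregate HBM3e
total_fp8_pflops = gpus * fp8_per_gpu_pflops    # ~32 PFLOPS FP8

print(f"Aggregate HBM3e: ~{total_hbm_tb:.1f} TB")
print(f"Aggregate FP8:   ~{total_fp8_pflops:.0f} PFLOPS")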

When paired with NVIDIA Grace™ CPUs over the ultra-fast NVLink-C2C interconnect, the H200 forms the GH200 Grace Hopper Superchip with HBM3e, an integrated module designed for giant-scale HPC and AI applications.

To further facilitate AI acceleration, NVIDIA offers a comprehensive suite of software tools under its accelerated computing platform. The NVIDIA AI Enterprise suite of software empowers developers and enterprises to build and accelerate production-ready applications spanning a wide range of domains, from AI to HPC. This includes support for workloads such as speech, recommender systems, and hyperscale inference.

