A Leap Forward: Nvidia Unveils the Blackwell B200 GPU

Nvidia has made significant strides in the AI chip market with its highly sought-after H100, propelling the company’s value beyond tech giants like Alphabet and Amazon. Now, Nvidia is set to extend its lead even further with the introduction of the new Blackwell B200 GPU and GB200 “superchip.”

The Blackwell B200 GPU packs an impressive 208 billion transistors and offers up to 20 petaflops of FP4 performance. The GB200 superchip pairs two B200 GPUs with a single Grace CPU, and Nvidia claims it delivers up to 30 times the performance of an H100 for LLM inference workloads while reducing cost and energy consumption by up to 25 times.

Training a 1.8 trillion parameter model previously required 8,000 Hopper GPUs and 15 megawatts of power, but Nvidia asserts that just 2,000 Blackwell GPUs can accomplish the same task while drawing only four megawatts. The GB200 also exhibits notable performance improvements: seven times the performance of an H100 and four times the training speed, according to Nvidia’s testing on a GPT-3 benchmark with 175 billion parameters.
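
A quick back-of-the-envelope calculation, using only the figures Nvidia quoted, puts those cluster claims in perspective (the per-GPU numbers below are derived by me, not published by Nvidia):

```python
# Rough comparison of the quoted training-cluster figures for a
# 1.8-trillion-parameter model. Inputs come from Nvidia's announcement;
# everything derived below is an illustration, not a benchmark result.

hopper_gpus, hopper_mw = 8_000, 15.0       # Hopper cluster: GPUs, total power (MW)
blackwell_gpus, blackwell_mw = 2_000, 4.0  # Claimed Blackwell equivalent

print(f"GPU count reduction:   {hopper_gpus / blackwell_gpus:.2f}x")  # 4.00x
print(f"Total power reduction: {hopper_mw / blackwell_mw:.2f}x")      # 3.75x

# Per-GPU power budget implied by those totals (kW per GPU):
print(f"Hopper:    {hopper_mw * 1000 / hopper_gpus:.2f} kW/GPU")       # 1.88
print(f"Blackwell: {blackwell_mw * 1000 / blackwell_gpus:.2f} kW/GPU") # 2.00
```

Notably, the implied per-GPU power draw barely changes; the claimed savings come almost entirely from needing a quarter as many GPUs.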

Nvidia attributes these advancements to two key factors. First, the Blackwell GPUs use a second-generation transformer engine that doubles compute, bandwidth, and usable model size by representing each parameter with four bits instead of eight, which is where the 20 petaflops of FP4 performance comes from. Second, a next-gen NVLink switch lets up to 576 GPUs communicate with one another over 1.8 terabytes per second of bidirectional bandwidth. To achieve this, Nvidia built a new network switch chip with 50 billion transistors and its own onboard compute capabilities.
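
As a minimal sketch of why halving precision doubles usable model size, consider the weight storage for the 1.8-trillion-parameter model mentioned above (this toy calculation ignores activations, KV caches, and format overhead):

```python
# Illustrative only: weight-storage footprint at 8-bit vs 4-bit precision.
# Real deployments also carry activations and other overhead not modeled here.

params = 1.8e12  # parameters in the model discussed above

for bits in (8, 4):
    terabytes = params * bits / 8 / 1e12
    print(f"FP{bits}: {terabytes:.1f} TB of weights")
# FP8: 1.8 TB of weights
# FP4: 0.9 TB of weights -> the same memory now fits a model twice as large
```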

Previously, Nvidia says, a cluster of just 16 GPUs could spend 60 percent of its time exchanging data and only 40 percent computing. The Blackwell architecture addresses this communication bottleneck, enabling far better utilization of the available compute.
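
A toy calculation makes the cost of that bottleneck concrete (the per-GPU peak is an arbitrary unit, chosen only to show the proportions):

```python
# How communication overhead erodes effective cluster throughput.
# The 60% figure is Nvidia's characterization of a 16-GPU Hopper-era cluster;
# per-GPU peak is an arbitrary unit used purely for illustration.

gpus = 16
comm_fraction = 0.60   # fraction of wall-clock time spent exchanging data
peak_per_gpu = 1.0     # arbitrary compute units

effective = gpus * peak_per_gpu * (1 - comm_fraction)
print(f"Effective throughput: {effective:.1f} of {gpus} peak units "
      f"({1 - comm_fraction:.0%} utilization)")
# Effective throughput: 6.4 of 16 peak units (40% utilization)
```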

Nvidia anticipates significant demand for these GPUs and has designed larger packages to meet it. The GB200 NVL72, for instance, incorporates 36 CPUs and 72 GPUs into a single liquid-cooled rack, delivering 720 petaflops of AI training performance or 1,440 petaflops of inference. With nearly two miles of cabling spread across 5,000 individual cables, the rack represents a major leap in computational power.
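
Those rack-level numbers are consistent with the per-chip figure quoted earlier. The sanity check below assumes the 20-petaflop FP4 figure applies per B200, and that the training number reflects math at half the FP4 rate (i.e., FP8); that is my reading of the announcement, not an official breakdown:

```python
# Sanity check on the GB200 NVL72 rack figures using the per-chip number quoted
# earlier. Assumptions: 20 PFLOPS of FP4 per B200, and training throughput at
# half the FP4 rate (i.e., FP8) -- my inference, not an official breakdown.

gpus_per_rack = 72
fp4_pflops_per_gpu = 20

inference_pflops = gpus_per_rack * fp4_pflops_per_gpu
print(f"Rack FP4 inference: {inference_pflops} petaflops")       # 1440, as quoted
print(f"Rack training:      {inference_pflops // 2} petaflops")  # 720, as quoted
```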

Notable cloud service providers such as Amazon, Google, Microsoft, and Oracle have expressed interest in offering the NVL72 racks. Nvidia is also keen to sell companies complete systems, such as the DGX Superpod for DGX GB200, which combines eight of those racks into one: 288 CPUs, 576 GPUs, 240TB of memory, and a staggering 11.5 exaflops of FP4 computing power.
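
The Superpod figures follow the same linear scaling from the rack-level numbers, as a quick check confirms:

```python
# The DGX Superpod numbers scale linearly from the NVL72 rack figures above.
racks = 8
print(f"CPUs: {racks * 36}")   # 288
print(f"GPUs: {racks * 72}")   # 576
print(f"FP4 compute: {racks * 1440 / 1000:.2f} exaflops")  # 11.52 ~ quoted 11.5
```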

Nvidia’s vision extends beyond individual systems. The company envisions deployments scaling up to tens of thousands of GB200 superchips, connected via advanced networking technologies like Quantum-X800 InfiniBand or Spectrum-X800 Ethernet.

Although this announcement came from Nvidia’s GPU Technology Conference, which predominantly focuses on GPU computing and AI rather than gaming, it is likely that the Blackwell GPU architecture will also power future gaming GPUs, including the anticipated RTX 50-series lineup.

FAQ

What is the Blackwell B200 GPU?

The Blackwell B200 GPU is Nvidia’s latest graphics processing unit designed to deliver exceptional performance in the field of artificial intelligence.

What is the advantage of the GB200 superchip?

The GB200 superchip combines two Blackwell B200 GPUs with a single Grace CPU, offering significant performance improvements and energy efficiency for LLM inference workloads.

How does the Blackwell architecture enhance communication between GPUs?

Blackwell’s next-gen NVLink switch allows up to 576 GPUs to communicate with one another over 1.8 terabytes per second of bidirectional bandwidth, sharply reducing the time spent waiting on data exchange. On the compute side, the second-gen transformer engine doubles compute, bandwidth, and usable model size by using four bits per parameter instead of eight.

What are the potential applications of the Blackwell B200 GPU?

The Blackwell B200 GPU has diverse applications in AI training and inference, enabling organizations to perform tasks such as language processing, image recognition, and data analysis more efficiently.

Which companies have shown interest in the NVL72 racks?

Major cloud service providers, including Amazon, Google, Microsoft, and Oracle, have expressed interest in incorporating the NVL72 racks into their offerings.

Sources:
– Nvidia: https://www.nvidia.com/
– The Verge: https://www.theverge.com/
