
NVIDIA B200 Tensor Core – graphics accelerator based on the Blackwell architecture. On board, among others: 192 GB HBM3e memory


Around the same time two years ago, NVIDIA presented its first graphics accelerator based on the Hopper architecture: the NVIDIA H100, which today is one of the key drivers of the company's financial results. Two years after Hopper's reveal, it is time for its successor: the Blackwell generation, which this time has been prepared both for professional applications and, later, for gaming products. First, however, comes the B200 accelerator, billed as a groundbreaking AI computing chip. It is also a huge one, because it is effectively two GPU dies combined into a single unit.

During the opening keynote at GTC 2024, NVIDIA revealed the B200 graphics accelerator, based on the Blackwell architecture. This is also a first look at the new generation, which will later be expanded with, among others, solutions for gamers.


The NVIDIA B200 accelerator is effectively two connected dies that work as a single chip. Each die has 104 billion transistors, for a total of 208 billion. As reported before the conference, NVIDIA ultimately decided against moving to TSMC's N3 process due to unsatisfactory lithography parameters. Instead, the manufacturer and TSMC opted for a refined version of the "4N" node known from the Hopper and Ada Lovelace architectures. This time the process is called TSMC 4NP, which most likely means further modifications to the base 5 nm lithography.

| | NVIDIA B200 | NVIDIA H200 (SXM5) | NVIDIA H100 (SXM5) |
|---|---|---|---|
| Architecture | Blackwell | Hopper | Hopper |
| Process node | TSMC 4NP | TSMC 4N | TSMC 4N |
| Transistors | 208 billion | 80 billion | 80 billion |
| SM blocks | 160 | 132 | 132 |
| FP32 CUDA cores | 20,480 | 16,896 | 16,896 |
| Tensor Cores | Yes, 5th generation | Yes, 4th generation | Yes, 4th generation |
| Tensor Core formats | FP64, TF32, BF16, FP16, FP8, INT8, FP6, FP4 | FP64, TF32, BF16, FP16, FP8, INT8 | FP64, TF32, BF16, FP16, FP8, INT8 |
| AI compute | 20,000 TOPS | 3,958 TOPS | 3,958 TOPS |
| Memory | 192 GB HBM3e | 141 GB HBM3e | 80 GB HBM3 |
| Memory speed | 8 Gbps | 6.5 Gbps | 5.2 Gbps |
| Memory bus | 8192-bit | 5120-bit | 5120-bit |
| Memory bandwidth | 8 TB/s | 4.8 TB/s | 3 TB/s |
| TDP | 700 W | 700 W | 700 W |
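As a quick sanity check, peak memory bandwidth follows from bus width and per-pin data rate. For the B200, 8192 bits at 8 Gbps per pin works out to 8192 GB/s, matching the quoted ~8 TB/s. A minimal sketch (the function name is illustrative):

```python
def hbm_bandwidth_gbps(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak DRAM bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8 bits per byte."""
    return bus_width_bits * pin_rate_gbps / 8

# B200: 8192-bit bus at 8 Gbps per pin -> 8192 GB/s, i.e. ~8.2 TB/s
print(hbm_bandwidth_gbps(8192, 8.0) / 1000, "TB/s")
```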


NVIDIA B200 Tensor Core offers a total of 160 SM blocks and 20,480 CUDA cores. In the form presented, the Blackwell architecture delivers 20 PFLOPS of AI compute, 2.5x more than Hopper. The B200 also carries eight HBM3e stacks (24 GB each) for a total of 192 GB of VRAM on an 8192-bit bus, with bandwidth reaching an impressive 8 TB/s. Peak TDP is 700 W (the same as the NVIDIA H100 SXM5 accelerator). One of the novelties of the Blackwell architecture is a new-generation Transformer Engine. Thanks to support for micro-tensor scaling combined with advanced dynamic-range management algorithms integrated into the NVIDIA TensorRT-LLM and NeMo Megatron libraries, Blackwell can double compute throughput and supported model sizes for AI inference using 4-bit (FP4) and 6-bit (FP6) floating-point formats.
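NVIDIA has not published the details of micro-tensor scaling, but the general idea behind block-wise scaled low-bit quantization can be sketched: each small block of values shares one scale factor, so even a 4-bit code can cover a wide dynamic range. The following is a simplified illustration using symmetric 4-bit integer codes, not NVIDIA's actual FP4 format:

```python
import numpy as np

def quantize_block(block: np.ndarray, levels: int = 7):
    """Quantize one small block to signed codes in [-levels, levels] sharing one scale."""
    scale = max(float(np.abs(block).max()) / levels, 1e-12)
    codes = np.clip(np.round(block / scale), -levels, levels).astype(np.int8)
    return codes, scale

def dequantize_block(codes: np.ndarray, scale: float) -> np.ndarray:
    return codes.astype(np.float32) * scale

x = np.array([0.02, -0.5, 1.3, -2.6], dtype=np.float32)
codes, scale = quantize_block(x)
x_hat = dequantize_block(codes, scale)
# Within a block, reconstruction error is bounded by half the scale step.
```

Because the scale is chosen per block rather than per tensor, a few large outliers in one block do not destroy the precision of every other block.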


To boost performance for models with billions of parameters and enable much larger AI models, the Blackwell architecture supports the latest, 5th-generation NVLink. It offers a record bidirectional bandwidth of 1.8 TB/s per GPU, ensuring smooth and fast communication between up to 576 Blackwell chips. Another novelty is a dedicated RAS engine, which aims to increase reliability and availability and to simplify servicing. Additionally, at the chip level, Blackwell can use AI-based predictive maintenance to run diagnostics and forecast possible reliability issues in advance; the goal is to maximize the time a Blackwell-based system operates at full capacity. Blackwell B200 systems also include a dedicated decompression engine that supports the latest formats and accelerates database queries, targeting the highest performance in data analytics and data science.
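To put 1.8 TB/s in perspective, a hedged back-of-envelope estimate: moving the weights of a hypothetical 70-billion-parameter model quantized to FP4 (4 bits per weight, i.e. 35 GB) over one such link would take on the order of 20 ms, ignoring protocol overhead and optimistically assuming the full bidirectional rate is usable in one direction:

```python
def transfer_time_ms(num_params: float, bits_per_param: float, link_tb_per_s: float) -> float:
    """Idealized time to move a model's weights over one link, in milliseconds."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / (link_tb_per_s * 1e12) * 1e3

# Hypothetical 70B-parameter model at FP4 over a 1.8 TB/s NVLink port: ~19.4 ms
print(round(transfer_time_ms(70e9, 4, 1.8), 1), "ms")
```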


Along with the B200, the manufacturer also presented a large superchip, the GB200, consisting of two B200 accelerators and one NVIDIA Grace processor (72 ARM Neoverse V2 cores). A platform built this way offers 40 PFLOPS of AI performance and 864 GB of memory with 16 TB/s of bandwidth. It will also support the latest PCIe 6.0 bus. The TDP of the GB200 is expected to reach 2,700 W. Two Grace-Blackwell superchips (GB200) can be integrated into a single Blackwell compute node, providing up to 80 PFLOPS of AI compute and 1.7 TB of HBM3e memory with 32 TB/s of bandwidth. Such a system is built in a special MGX chassis with liquid cooling. According to the manufacturer, the Grace-Blackwell GB200 superchip will offer up to 30 times higher AI performance than the Grace-Hopper GH200 platform.
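The GB200 numbers add up from the per-component figures. The split of the 864 GB is an inference here, assuming the Grace CPU contributes 480 GB of LPDDR5X as in the GH200 configuration:

```python
B200_FP4_PFLOPS = 20     # per accelerator, from the announcement
B200_HBM3E_GB = 192      # per accelerator
GRACE_LPDDR5X_GB = 480   # assumption: same Grace memory config as in GH200

gb200_pflops = 2 * B200_FP4_PFLOPS                      # 40 PFLOPS
gb200_memory_gb = 2 * B200_HBM3E_GB + GRACE_LPDDR5X_GB  # 864 GB
print(gb200_pflops, "PFLOPS,", gb200_memory_gb, "GB")
```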


The GB200 chips will also form part of a new rack-scale platform called the GB200 NVL72. A single such platform combines 36 Grace processors and 72 Blackwell B200 accelerators (with around 30 TB of fast memory in total) and offers 1.4 exaFLOPS of AI compute; power on this scale has not previously been available in a structure of this type. The GB200 NVL72 platforms can in turn be combined into a complete cluster of 32,000 B200 accelerators, with a total of 13 PB of memory and AI compute reaching 645 exaFLOPS. Systems and accelerators based on the Blackwell architecture will reach customers at a later date, though still in 2024.
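The NVL72 figures are consistent with the per-chip numbers under the same assumption about Grace memory (480 GB of LPDDR5X per CPU, an inference rather than a stated spec): 72 x 192 GB of HBM3e plus 36 x 480 GB comes to roughly 31 TB, in line with the quoted ~30 TB, and 72 x 20 PFLOPS gives 1,440 PFLOPS, i.e. about 1.4 exaFLOPS:

```python
GPUS, CPUS = 72, 36
HBM3E_GB = 192    # per B200
LPDDR5X_GB = 480  # assumed per Grace CPU, as in GH200

hbm_total_tb = GPUS * HBM3E_GB / 1024                        # ~13.5 TB of HBM3e alone
fast_mem_tb = (GPUS * HBM3E_GB + CPUS * LPDDR5X_GB) / 1024   # ~30.4 TB combined
exaflops = GPUS * 20 / 1000                                  # 1.44 exaFLOPS
print(round(hbm_total_tb, 1), "TB HBM3e,", round(fast_mem_tb, 1), "TB total,", exaflops, "EFLOPS")
```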

On the left, the NVIDIA B200 Blackwell accelerator; on the right, the NVIDIA H100 Hopper

NVIDIA Grace-Blackwell Superchip GB200


Source: NVIDIA


