
Nvidia’s new AI accelerator – PCW – comes with enormous computing capacity


The Blackwell architecture has been unveiled, and judging by the figures, the new inference system can cement the giant company's market position.

One of the protagonists of this year's GTC conference opening keynote was, unsurprisingly, Jensen Huang, who continues to steer Nvidia to ever greater heights, and messages reinforcing the importance of artificial intelligence were not in short supply in the first few minutes. Optimism and confident forward thinking in the AI era are understandable: the company has built a seemingly unassailable position for itself from the GPGPU era to Tensor-equipped GPUs, thanks in part to the Ampere and Hopper architectures.

Given the fast pace characteristic of the industry, the time has come for a generational change, and although a few details about the Blackwell GPUs could be deduced from the rumors, it is difficult to gauge the development curve of AI capabilities without knowing the specific hardware behind them. Well, nearly all of it has now come to light, and it looks like platforms equipped with the accelerators codenamed B200 and GB200 will bring a remarkable leap in computing power and efficiency.

A five-fold jump compared to its predecessor

Perhaps this was the first sentence on the whiteboard where top management wrote down the target numbers expected of Blackwell. Given how quickly the use of artificial intelligence is advancing, a 50-100% improvement simply would not suffice; the endless fields of large language models and related algorithms demand far more computing power. The B200 as a standalone GPU therefore cannot be considered a standard, monolithic chip: Nvidia opted for a multi-chip design, meaning there are two compute dies inside one package, each with its own HBM3E memory subsystem. The physical dimensions are accordingly above average, and the 208 billion transistors can rightly be called a technological feat, made possible by TSMC's fine-tuned N4P production technology. For comparison, AMD's Instinct MI300 with its integrated APU is built from 146 billion transistors, and the H100 accelerator now being replaced in-house has 80 billion.


The B200 is effectively an integration of two B100 dies with fully coherent cache and memory usage, the latter with a maximum capacity of 192 GB. Using the HBM3E memory chips currently available to manufacturers, a data transfer rate of 8 TB/s can be achieved on a 4096-bit memory bus, while the two chiplets communicate with each other at 10 TB/s.
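As a quick sanity check on those figures, peak HBM bandwidth is simply bus width times per-pin data rate. The sketch below is our own back-of-the-envelope arithmetic, assuming the quoted 4096-bit bus is per compute die (so the dual-die B200 exposes 8192 bits in total) and a per-pin rate of around 8 Gb/s, a plausible HBM3E operating point; neither assumption is stated in the source.

    # Back-of-the-envelope check of the quoted memory figures.
    # Assumptions (ours, not Nvidia's): the 4096-bit bus is per compute
    # die, and the HBM3E stacks run at roughly 8 Gb/s per pin.

    def hbm_bandwidth_tbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
        """Peak bandwidth in TB/s from bus width (bits) and per-pin rate (Gb/s)."""
        return bus_width_bits * pin_rate_gbps / 8 / 1000  # bits -> bytes, Gb -> Tb

    total_bus_bits = 2 * 4096   # two dies, 4096 bits each (assumed split)
    pin_rate_gbps = 8.0         # assumed HBM3E per-pin data rate

    print(f"{hbm_bandwidth_tbs(total_bus_bits, pin_rate_gbps):.1f} TB/s")
    # -> 8.2 TB/s, in line with the article's 8 TB/s figure

Under these assumptions the quoted 8 TB/s aggregate falls out directly; a narrower total bus would require a correspondingly higher per-pin rate.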


Available with different interfaces (NVLink, PCIe Gen6), the new chip is built into DGX servers, which, according to Nvidia, offer the most powerful x86-64-based computing capacity for AI tasks; thanks to the GB200, the Arm line is also covered. In any case, as with the GH200, a Grace processor equipped with 72 Neoverse V2 cores orchestrates the special GPUs, and competitors will have a hard time matching the power and capabilities of the B200 pair integrated on the new superchip.

Can the new GeForce flagship be guessed from this?

The GTC presentation was not aimed at everyday users, but with knowledge of the new chips, more can be inferred about the expected speed of the RTX 5090. In the best case, half of the B200's structure could reach the home PC market, i.e. a GPU built from roughly 100 billion transistors. There is no official confirmation of Blackwell's CUDA and Tensor core counts, but the RTX line will definitely receive the 5th-generation Tensor units, and the number of shader units may land somewhere around 20,000 CUDA cores. Compared to the RTX 4090, a roughly 25% increase can be expected, which could be extended further by more efficient operation and the next development stage of DLSS.
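Purely as an illustration of that estimate, the rough arithmetic below plugs the article's speculative numbers (half a B200, around 20,000 CUDA cores) against the known RTX 4090 core count; none of these consumer figures are confirmed by Nvidia.

    # Speculative scaling estimate for a consumer Blackwell part, using
    # only the article's assumptions; nothing here is confirmed by Nvidia.

    B200_TRANSISTORS_BN = 208        # billion, full dual-die B200 (per article)
    RTX_4090_CUDA_CORES = 16_384     # known Ada Lovelace (AD102) figure

    est_transistors_bn = B200_TRANSISTORS_BN / 2   # one die: ~104 billion
    est_cuda_cores = 20_000                        # article's speculation

    core_uplift = est_cuda_cores / RTX_4090_CUDA_CORES - 1
    print(f"Estimated transistors: ~{est_transistors_bn:.0f} billion")
    print(f"CUDA core uplift vs RTX 4090: ~{core_uplift:.0%}")
    # -> ~22% more cores, consistent with the ~25% overall estimate once
    #    architectural efficiency gains are factored in

The raw core-count uplift alone lands just below the article's 25% figure, which is why efficiency improvements and DLSS would have to carry the rest of the gain.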
