Tuesday, November 30

Nvidia Technology Powers 70% of the World’s Top 500 Supercomputers | Ruri Web

NVIDIA (www.nvidia.co.kr, CEO Jensen Huang), a leader in artificial intelligence (AI) computing technology, announced at the Supercomputing Conference 2021 (SC21) that 355 systems, or about 70% of the TOP500 list of the world’s supercomputers, are accelerated by NVIDIA technology. In addition, more than 90% of newly deployed systems use NVIDIA technology.

In addition, 23 of the top 25 systems in the Green500, which ranks the most energy-efficient systems, are powered by NVIDIA technology. On average, NVIDIA GPU-based supercomputers are 3.5 times more energy efficient than non-GPU systems on the Green500.

Microsoft’s GPU-accelerated Azure supercomputer climbed into the top 10, becoming the first cloud-based system to do so. AI is also revolutionizing computing for scientific research. The number of papers that use both high-performance computing (HPC) and machine learning has grown rapidly, from about 600 in 2018 to 5,000 in 2020.

The continued convergence of HPC and AI workloads is also reflected in new benchmarks such as HPL-AI and MLPerf HPC.

HPL-AI is a new benchmark that reflects the convergence of HPC and AI workloads. It uses mixed-precision arithmetic (the basis of deep learning and of many scientific and commercial applications) while still delivering the full accuracy of double-precision arithmetic (the standard yardstick for traditional HPC benchmarks).
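The idea of computing mostly in low precision while still reaching double-precision accuracy can be illustrated with mixed-precision iterative refinement: solve the linear system in low precision, then correct the answer using residuals computed in double precision. The following is a minimal NumPy sketch of that general technique, not the HPL-AI benchmark code; the function name, test matrix, and tolerance are assumptions chosen for illustration.

```python
import numpy as np


def solve_mixed_precision(A, b, max_iters=10, tol=1e-12):
    """Solve A x = b using float32 solves plus float64 residual corrections.

    Illustrative sketch only: a production solver would factor A once
    (for example with LU) and reuse the factors instead of calling
    np.linalg.solve on every iteration.
    """
    A32 = A.astype(np.float32)

    # Initial solve carried out entirely in single precision.
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)

    for _ in range(max_iters):
        # The residual is computed in double precision.
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # The correction is solved in single precision, then applied in double.
        d = np.linalg.solve(A32, r.astype(np.float32))
        x += d.astype(np.float64)
    return x


# Hypothetical, well-conditioned test problem (not taken from the article).
rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)

x = solve_mixed_precision(A, b)
rel_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(f"relative residual: {rel_residual:.2e}")
```

On a well-conditioned matrix like this one, a few correction steps should bring the single-precision solution close to double-precision accuracy. The approach relies on the matrix being well conditioned enough for the low-precision solve to be a useful approximation.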

MLPerf HPC targets a style of computing in which AI accelerates and augments simulations on supercomputers. It measures performance on three workloads typical of HPC centers: astrophysics (CosmoFlow), weather (DeepCAM), and molecular dynamics (OpenCatalyst).

NVIDIA covers the full stack with GPU-accelerated processing, smart networking, GPU-optimized applications, and libraries that support the convergence of AI and HPC. This approach has accelerated workloads and driven scientific innovation.

Accelerated Computing

In many use cases, the parallel processing capabilities of GPUs, combined with more than 2,500 GPU-optimized applications, can cut the time required for HPC jobs from weeks to hours. As NVIDIA continues to optimize its CUDA-X libraries and GPU-accelerated applications, it is not uncommon to see unexpected but substantial performance gains on the same GPU architecture.

As a result, the performance of some of the most widely used scientific applications, the so-called “golden suite,” has improved more than 16-fold over the past six years, and further gains are expected.

Performance of leading HPC, AI and ML applications improved 16x with full stack innovation
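To put the headline figure in perspective, a 16x gain over six years can be converted into an equivalent yearly rate. The short sketch below assumes steady compounding, which the source does not claim, and the helper name is hypothetical.

```python
def annualized_factor(total_factor: float, years: float) -> float:
    """Return the constant yearly factor that compounds to total_factor."""
    return total_factor ** (1.0 / years)


# Figures from the article: 16x over six years (treated here as a lower bound).
factor = annualized_factor(16.0, 6)
print(f"{factor:.2f}x per year")  # prints "1.59x per year"
```

Under that assumption, a 16x improvement corresponds to performance growing by roughly 59% per year on the same class of applications.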

NVIDIA also offers the latest versions of its AI and HPC software as containers in its NGC catalog so that users can quickly take advantage of these performance gains. Users simply pull the application and run it on their supercomputer, in their data center, or in the cloud.

Convergence of HPC and AI

Combining HPC and AI accelerates simulation while retaining the accuracy of conventional simulation methods. This is why a growing number of researchers are using AI to speed up their work, including four of the finalists for the Gordon Bell Prize, the most prestigious award in supercomputing. Organizations are also racing to build exascale AI computers that will support new models combining HPC and AI.

In addition, relatively new benchmarks, such as HPL-AI and MLPerf HPC, focus specifically on the performance of HPC and AI convergence models, reflecting the reality that HPC and AI workloads continue to converge. To further accelerate this trend, Nvidia has released a range of advanced libraries and new software development kits for HPC.

Graphs, the main data structure of modern data science, are now projected into deep neural network frameworks through a new Python package, Deep Graph Library (DGL). NVIDIA Modulus allows you to build and train physics-based machine learning models that can learn and follow the laws of physics. Nvidia also introduced the following new libraries:

# ReOpt – Improving Operational Efficiency in the $10 Trillion Logistics Industry

# cuQuantum – Accelerate Your Quantum Computing Research

# cuNumeric – NumPy acceleration for scientists, data scientists, and machine learning and AI researchers in the Python community
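As an illustration of the cuNumeric item above: the library is presented as an accelerated replacement for NumPy, so existing array code is meant to run after changing only the import. The sketch below assumes that usage; the import line `import cunumeric as np` follows the project's documented usage at the time and should be treated as an assumption here. The code falls back to plain NumPy when the package is not installed, and the Jacobi example itself is an illustration that does not come from the article.

```python
try:
    # Assumption: cuNumeric is installed and importable under this name.
    import cunumeric as np
    backend = "cunumeric"
except ImportError:
    # Fallback so the sketch still runs on a machine without cuNumeric.
    import numpy as np
    backend = "numpy"


def jacobi_step(grid):
    """One Jacobi relaxation step for the 2D Laplace equation.

    Each interior point becomes the average of its four neighbors;
    boundary values are left unchanged.
    """
    new = grid.copy()
    new[1:-1, 1:-1] = 0.25 * (
        grid[:-2, 1:-1] + grid[2:, 1:-1] + grid[1:-1, :-2] + grid[1:-1, 2:]
    )
    return new


# Hypothetical problem: a 64x64 plate with the top edge held at 1.0.
grid = np.zeros((64, 64))
grid[0, :] = 1.0

for _ in range(100):
    grid = jacobi_step(grid)

mean_interior = float(grid[1:-1, 1:-1].mean())
print(f"backend={backend}, mean interior value={mean_interior:.4f}")
```

The array code is identical for both backends; only the import decides where the work runs.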

It is NVIDIA Omniverse that ties them all together. Omniverse is a virtual world simulation and collaboration platform for 3D workflows. It is used for digital twin simulation of warehouses and factories, physical and biological systems, the 5G edge, robots, autonomous vehicles, avatars, and more. NVIDIA has also announced plans to build an Omniverse-based supercomputer, E-2 (Earth-2), which will be dedicated to predicting climate change by creating a digital twin of the planet.

Cloud Native Supercomputing

Supercomputers are taking on a growing range of workloads across data analytics, AI, simulation, and virtualization. As a result, the CPU is under increasing load from the communication tasks needed to operate large, complex systems.

The data processing unit (DPU) offloads some of these tasks to relieve the CPU. The NVIDIA BlueField DPU, a fully integrated data-center-on-a-chip platform, offloads and manages the data center’s infrastructure tasks in place of the host processor, enabling stronger orchestration and security for supercomputers.

The combination of the BlueField DPU architecture and the NVIDIA Quantum InfiniBand platform provides optimal bare-metal performance while natively supporting multi-node tenant isolation.

The NVIDIA Quantum InfiniBand platform provides predictable, bare-metal performance isolation.

These new systems are more secure thanks to a zero-trust approach. BlueField DPUs isolate applications from the infrastructure, and NVIDIA DOCA 1.2, the latest BlueField software platform, supports next-generation distributed firewalls and wider use of line-rate data encryption. NVIDIA Morpheus, which assumes an intruder is already inside the data center, uses deep learning-based data science to detect the intruder’s activity in real time.

NVIDIA Quantum-2 is a 400 Gbps InfiniBand platform comprising Quantum-2 switches, ConnectX-7 NICs, BlueField-3 DPUs, and new software for a new networking architecture. It combines the advantages of bare-metal high-performance computing with secure multi-tenancy, so that next-generation supercomputers can be used more securely and efficiently in a cloud-native manner.

Reporter Yoo Dong-shik

