Rack Scale AI Training Servers with Supermicro Liquid Cooling Solutions
These AI servers are well suited to medium to very large AI training scenarios and contain dual AMD or Intel CPUs and eight high-performance NVIDIA GPUs. The purpose-built, highest-density, turnkey rack-scale solution is scalable and customizable to meet Deep Learning workload demands at any scale.
Built on the NVIDIA H100 SXM GPU, these rack-scale AI solutions deliver unprecedented Deep Learning performance.
The Rack Scale AI Solution is powered by Supermicro GPU servers, which are compact, high-density compute systems. The cluster uses the latest NVIDIA HGX™ H100 GPUs.
The design features 32 GPUs in the Base Package (Scalable Unit, or SU) and scales up to 128 GPUs per POD (4 racks of servers) and 256 GPUs per SuperPOD (8 racks of servers).
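The scaling figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes one Scalable Unit fills one rack (inferred from 128 GPUs across 4 racks and 256 GPUs across 8 racks) and uses the eight-GPUs-per-server figure from the text. The function and constant names are hypothetical.

```python
# Figures taken from the text; names are illustrative, not a Supermicro API.
GPUS_PER_SERVER = 8        # eight NVIDIA GPUs per server
GPUS_PER_SU = 32           # Base Package (Scalable Unit)
RACKS_PER_POD = 4
RACKS_PER_SUPERPOD = 8


def gpus_for_racks(racks: int, gpus_per_rack: int = GPUS_PER_SU) -> int:
    """Total GPUs, assuming each rack holds exactly one Scalable Unit."""
    if racks < 0:
        raise ValueError("racks must be non-negative")
    return racks * gpus_per_rack


def servers_for_gpus(gpus: int, gpus_per_server: int = GPUS_PER_SERVER) -> int:
    """Number of servers needed to host the given GPU count (rounded up)."""
    if gpus < 0:
        raise ValueError("gpus must be non-negative")
    return -(-gpus // gpus_per_server)  # ceiling division


pod_gpus = gpus_for_racks(RACKS_PER_POD)
superpod_gpus = gpus_for_racks(RACKS_PER_SUPERPOD)
servers_per_su = servers_for_gpus(GPUS_PER_SU)

print(f"POD: {pod_gpus} GPUs, SuperPOD: {superpod_gpus} GPUs")
print(f"Servers per Scalable Unit: {servers_per_su}")
```

Under these assumptions a Scalable Unit corresponds to four 8-GPU servers, and the POD and SuperPOD totals match the 128 and 256 GPUs quoted above.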
Large-Scale NVIDIA H100 AI Training Solution with Liquid Cooling
- Supreme AI Cluster for Exascale Computing
- Scalable Design achieving unprecedented peak performance
- Most Advanced Processors & Networking
- Flexible and Superior Cooling Options
- Representative Performance Benchmarks
- Supermicro Advantages with Plug and Play Rack Scale AI Solutions
Flexible and Superior Cooling Options
As TDPs rise for both CPUs and GPUs, large-scale AI clusters will soon demand cooling technologies that outperform air cooling.
The Supermicro Rack Scale AI Solution offers air and liquid cooling options; the liquid options include Direct To Chip, Rear Door Heat Exchangers, and Immersion Cooling. Powered by high-quality liquid cooling components, Supermicro’s AI solution provides dramatic savings in PUE and OPEX.
Types of Liquid Cooling for your Data Center
Direct To Chip
Liquid passes directly over the surface of a chip and draws heat away. The liquid is then cooled by a liquid-to-liquid heat exchanger located either within the rack or externally.
Immersion Cooling
The entire system is immersed in a liquid which cools all components. The warm liquid is then chilled and brought back into the tank.
Rear Door Heat Exchanger
The rear door of a rack contains several fans that draw hot air away from the servers; the air is cooled before it is returned. The cooling liquid is chilled externally to the door.
Benefits of Different Liquid Cooling Options
Method of Cooling | Direct To Chip | Immersion | Rear Door Heat Exchanger |
Primary Benefit | Works with a Range of Servers | Most Efficient | Least Disruptive |
Secondary Benefit | Lower Fan Speed and Noise | Lowest/No Noise | Can Be Installed Later |
Superior Effectiveness of Liquid Cooling
- Switching from Air Conditioning to More Effective Liquid Cooling Reduces OPEX by more than 40%
- Liquid Cooling Efficiency Dramatically Improves the PUE of Data Centers with High Performance, High Power CPUs and GPUs
- Reduces Costs and Environmental Impact
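PUE (Power Usage Effectiveness) is the ratio of total facility power to IT equipment power, so lower cooling overhead moves it closer to 1.0. The sketch below shows the calculation. The load figures are hypothetical values chosen for illustration and are not Supermicro measurements; the function names are likewise illustrative.

```python
def pue(it_kw: float, overhead_kw: float) -> float:
    """Power Usage Effectiveness = total facility power / IT equipment power."""
    if it_kw <= 0:
        raise ValueError("IT load must be positive")
    if overhead_kw < 0:
        raise ValueError("overhead must be non-negative")
    return (it_kw + overhead_kw) / it_kw


def overhead_reduction_pct(before_kw: float, after_kw: float) -> float:
    """Percentage drop in non-IT (cooling and facility) power."""
    if before_kw <= 0:
        raise ValueError("baseline overhead must be positive")
    return 100.0 * (before_kw - after_kw) / before_kw


# Hypothetical example: the same 1000 kW IT load under two cooling designs.
IT_LOAD_KW = 1000.0
AIR_OVERHEAD_KW = 600.0      # assumed overhead with air cooling
LIQUID_OVERHEAD_KW = 200.0   # assumed overhead with liquid cooling

pue_air = round(pue(IT_LOAD_KW, AIR_OVERHEAD_KW), 2)
pue_liquid = round(pue(IT_LOAD_KW, LIQUID_OVERHEAD_KW), 2)

print(f"PUE with air cooling:    {pue_air}")
print(f"PUE with liquid cooling: {pue_liquid}")
```

Real savings depend on climate, facility design, and utilization, so the inputs should be replaced with measured facility data before drawing conclusions.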
Liquid Cooling GPU Server
GPU Super Server SYS-821GE-TNHR | |
Overview | 8U Dual Socket (4th Gen Intel® Xeon® Scalable Processors), up to 8 SXM5 GPUs |
CPU | 2x 4th Gen Intel Xeon Scalable Processors |
Memory (additional memory available) | 32 DIMM slots; up to 8TB DRAM (32x 256GB) |
Graphics | 8x HGX H100 SXM5 GPUs (80GB, 700W TDP) |
Storage (additional storage available) | 8x 2.5” SATA; 8x 2.5” NVMe U.2 via PCIe switches; additional 8x 2.5” NVMe U.2 via PCIe switches (option); 2x NVMe M.2 |
Power | 3+3 Redundant 6x 3000W Titanium Level Efficiency Power Supplies |
GPU Super Server AS-8125GS-TNHR | |
Overview | 8U Dual Socket (4th Gen AMD EPYC™), up to 8 SXM5 GPUs |
CPU | 2x 4th Gen AMD EPYC™ Processors |
Memory (additional memory available) | 24 DIMM slots; up to 6TB ECC DDR5-4800 RDIMM |
Graphics | 8x HGX H100 SXM5 GPUs (80GB, 700W TDP) |
Storage (additional storage available) | 8x 2.5” SATA; 8x 2.5” NVMe U.2 via PCIe switches; additional 8x 2.5” NVMe U.2 via PCIe switches (option); 2x NVMe M.2 |
Power | 3+3 Redundant 6x 3000W Titanium Level Efficiency Power Supplies |
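The power and GPU rows in the two specification tables lend themselves to a quick budget check. The sketch below assumes that "3+3 Redundant" means three supplies carry the load while three are held in reserve, which is an interpretation rather than something the tables spell out. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerPowerBudget:
    """Rough per-server power budget built from the spec-table figures."""
    psu_count: int = 6          # 6x 3000W supplies
    psu_watts: int = 3000
    redundant_psus: int = 3     # assumed reading of "3+3 Redundant"
    gpu_count: int = 8          # 8x HGX H100 SXM5
    gpu_tdp_watts: int = 700    # 700W TDP per GPU
    gpu_memory_gb: int = 80     # 80GB per GPU

    @property
    def usable_watts(self) -> int:
        """Capacity available with the redundant supplies held in reserve."""
        return (self.psu_count - self.redundant_psus) * self.psu_watts

    @property
    def gpu_watts(self) -> int:
        """Combined GPU TDP."""
        return self.gpu_count * self.gpu_tdp_watts

    @property
    def headroom_watts(self) -> int:
        """Capacity left for CPUs, memory, storage, fans, and networking."""
        return self.usable_watts - self.gpu_watts

    @property
    def total_gpu_memory_gb(self) -> int:
        return self.gpu_count * self.gpu_memory_gb


budget = ServerPowerBudget()
print(f"Usable PSU capacity: {budget.usable_watts} W")
print(f"GPU TDP total:       {budget.gpu_watts} W")
print(f"Remaining headroom:  {budget.headroom_watts} W")
print(f"Total GPU memory:    {budget.total_gpu_memory_gb} GB")
```

TDP is a thermal design figure rather than a measured draw, so this is a sizing estimate only; actual power planning should follow the vendor's documentation.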
Supermicro Systems Available with Liquid Cooling
SYS-421GE-TNHR2-LCC, AS-4125GS-TNHR2-LCC
4U NVIDIA HGX H100 8-GPU Server
SYS-221GE-TNHT-LCC
2U NVIDIA HGX H100 4-GPU Server
ARS-111GL-NHR-LCC
1U NVIDIA GH200 Grace Hopper Superchip Server
ARS-111GL-DNHR-LCC
1U 2-Node NVIDIA GH200 Grace Hopper Superchip Server
AS-2145GH-TNMR
2U AMD APU Server
SYS-241E-TNRTTP
2U Intel® Multi-Processor Server
SYS-221BT-DNTR
2U2N BigTwin® Server
SYS-221BT-HNTR
2U4N BigTwin® Server
SBI-421E-1T3N
8U 20N SuperBlade® Server
SYS-821GE-TNHR, AS-8125GS-TNHR
8U 8GPU Server
SYS-421GE-TNR, AS-4125GS-TNRT
4U PCIe GPU Server
SYS-421GU-TNXR
4U 4GPU Server
SYS-121H-TNR, AS-1125HS-TNR
1U Hyper Server
SYS-221H-TNR, AS-2125HS-TNR
2U Hyper Server
SYS-F511E2-RT
4U8N FatTwin® Server
SYS-F521E3-RTB
4U4N FatTwin® Server