SDSC Voyager: Habana Training and Inference Processor-Based AI System
Voyager is a heterogeneous system designed to support complex deep learning AI workflows. The system features 42 Intel Habana Gaudi training nodes, each with 8 training processors (336 in total). Each training node has 512 GB of memory and 6.4 TB of node-local NVMe storage. The Gaudi training processors feature specialized hardware units for AI, HBM2 memory, and on-chip high-speed Ethernet. Some of the on-chip Ethernet ports form a non-blocking, all-to-all network among the processors within a node; the remaining ports are aggregated into six 400G connections per node, which plug into a 400G Arista switch to provide a scale-out network. Voyager also has two first-generation inference nodes, each with 8 inference processors (16 in total). In addition to the custom AI hardware, the system has 36 Intel x86 compute nodes for general-purpose computing and data processing. Voyager features 3 PB of storage, currently deployed as a Ceph filesystem.
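
The per-node figures above can be rolled up into system-wide totals. The following is a minimal Python sketch of that arithmetic, using only the numbers quoted in the description. The function name `voyager_totals` and the dictionary keys are hypothetical, chosen for illustration. The aggregate memory, NVMe, and uplink figures are derived here by simple multiplication and are not stated in the source; the uplink figure assumes "400G" means 400 Gb/s per connection.

```python
# Per-node figures taken from the system description.
TRAINING_NODES = 42
PROCESSORS_PER_TRAINING_NODE = 8
MEMORY_PER_TRAINING_NODE_GB = 512
NVME_PER_TRAINING_NODE_TB = 6.4
UPLINKS_PER_TRAINING_NODE = 6
UPLINK_SPEED_GBPS = 400  # assumption: "400G" read as 400 Gb/s

INFERENCE_NODES = 2
PROCESSORS_PER_INFERENCE_NODE = 8


def voyager_totals():
    """Return system-wide totals derived from the per-node figures."""
    uplink_per_node = UPLINKS_PER_TRAINING_NODE * UPLINK_SPEED_GBPS
    return {
        "training_processors": TRAINING_NODES * PROCESSORS_PER_TRAINING_NODE,
        "inference_processors": INFERENCE_NODES * PROCESSORS_PER_INFERENCE_NODE,
        # Derived values below are not quoted in the description.
        "training_memory_gb": TRAINING_NODES * MEMORY_PER_TRAINING_NODE_GB,
        "training_nvme_tb": round(TRAINING_NODES * NVME_PER_TRAINING_NODE_TB, 1),
        "uplink_per_node_gbps": uplink_per_node,
        "uplink_all_nodes_gbps": TRAINING_NODES * uplink_per_node,
    }


if __name__ == "__main__":
    for name, value in voyager_totals().items():
        print(f"{name}: {value}")
```

The processor counts reproduce the totals given in the description (336 training and 16 inference processors). The remaining values are back-of-the-envelope aggregates and say nothing about achievable throughput or usable capacity.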

San Diego Supercomputer Center
