Georgia Institute of Technology

Indiana University

Description

Jetstream2 is a hybrid-cloud platform that provides flexible, on-demand, programmable cyberinfrastructure tools ranging from interactive virtual machine services to a variety of infrastructure and orchestration services for research and education. The primary resource is a standard CPU resource consisting of AMD Milan 7713 CPUs with 128 cores and 512 GB RAM per node, connected by 100 Gbps Ethernet to the spine.

Description

Jetstream2 GPU is a hybrid-cloud platform that provides flexible, on-demand, programmable cyberinfrastructure tools ranging from interactive virtual machine services to a variety of infrastructure and orchestration services for research and education. This portion of the resource is allocated separately from the primary resource and contains 360 NVIDIA A100 GPUs. Each node has 4 GPUs, 128 AMD Milan cores, and 512 GB RAM, and is connected by 100 Gbps Ethernet to the spine.

Description

Jetstream2 LM is a hybrid-cloud platform that provides flexible, on-demand, programmable cyberinfrastructure tools ranging from interactive virtual machine services to a variety of infrastructure and orchestration services for research and education. This portion of the resource is allocated separately from the primary resource and contains 32 GPU-ready compute nodes, each with 1 TB RAM and AMD Milan 7713 CPUs (128 cores per node), connected by 100 Gbps Ethernet to the spine.

Institute for Advanced Computational Science at Stony Brook University

Description

Ookami is a computer technology testbed supported by the National Science Foundation under grant OAC 1927880. It provides researchers with access to the A64FX processor, which was developed by Riken and Fujitsu for the Japanese path to exascale computing and is deployed in Fugaku, the fastest computer in the world until June 2022. Ookami is the first such computer outside of Japan. By focusing on crucial architectural details, the ARM-based, multi-core, 512-bit SIMD-vector processor with ultrahigh-bandwidth memory promises to retain familiar and successful programming models while achieving very high performance for a wide range of applications. While being very power-efficient, it supports a wide range of data types and enables both HPC and big data applications.

The Ookami HPE (formerly Cray) Apollo 80 system has 176 A64FX compute nodes each with 32GB of high-bandwidth memory and a 512 Gbyte SSD. This amounts to about 1.5M node hours per year. A high-performance Lustre filesystem provides about 0.8 Pbyte storage.
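The node-hour figure above can be checked with simple arithmetic. The following is a minimal sketch, assuming all 176 A64FX nodes are available around the clock for a 365-day year (an idealization; the variable names are illustrative and not part of any Ookami tooling):

```python
# Rough check of the yearly node-hour capacity quoted above.
# Assumption: every node is available 24 hours a day, 365 days a year.
A64FX_NODES = 176
HOURS_PER_YEAR = 24 * 365

node_hours_per_year = A64FX_NODES * HOURS_PER_YEAR
print(f"{node_hours_per_year:,} node hours per year")
```

Under these assumptions the result is 1,541,760 node hours, consistent with the quoted "about 1.5M".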

To facilitate users exploring current computer technologies and contrasting performance and programmability with the A64FX, Ookami also includes:

- 1 node with dual socket AMD Milan (64 cores) with 512 Gbyte memory
- 2 nodes with dual socket Thunder X2 (64 cores) each with 256 Gbyte memory
- 1 node with dual socket Intel Skylake (32 cores) with 192 Gbyte memory and 2 NVIDIA V100 GPUs

National Center for Supercomputing Applications

Description

The Delta CPU resource comprises 124 dual-socket compute nodes for general purpose computation across a broad range of domains able to benefit from the scalar and multi-core performance provided by the CPUs, such as appropriately scaled weather and climate, hydrodynamics, astrophysics, and engineering modeling and simulation, and other domains using algorithms not yet adapted for the GPU. Each Delta CPU node is configured with 2 AMD EPYC 7763 (“Milan”) processors with 64-cores/socket (128-cores/node) at 2.45GHz and 256GB of DDR4-3200 RAM. An 800GB, NVMe solid-state disk is available for use as local scratch space during job execution. All Delta CPU compute nodes are interconnected to each other and to the Delta storage resource by a 100 Gb/sec HPE Slingshot network fabric.

Description

The Delta GPU resource comprises 4 different node configurations intended to support accelerated computation across a broad range of domains such as soft-matter physics, molecular dynamics, replica-exchange molecular dynamics, machine learning, deep learning, natural language processing, textual analysis, visualization, ray tracing, and accelerated analysis of very large in-memory datasets. Delta is designed to support the transition of applications from CPU-only to using the GPU or hybrid CPU-GPU models. Delta GPU resource capacity is predominantly provided by 200 single-socket nodes, each configured with 1 AMD EPYC 7763 (“Milan”) processor with 64-cores/socket (64-cores/node) at 2.55GHz and 256GB of DDR4-3200 RAM. Half of these single-socket GPU nodes (100 nodes) are configured with 4 NVIDIA A100 GPUs with 40GB HBM2 RAM and NVLink (400 total A100 GPUs); the remaining half (100 nodes) are configured with 4 NVIDIA A40 GPUs with 48GB GDDR6 RAM and PCIe 4.0 (400 total A40 GPUs). Rounding out the GPU resource are 6 additional “dense” GPU nodes, containing 8 GPUs each, in a dual-socket CPU configuration (128-cores per node) and 2TB of DDR4-3200 RAM but otherwise configured similarly to the single-socket GPU nodes. Within the “dense” GPU nodes, 5 nodes employ NVIDIA A100 GPUs (40 total A100 GPUs in “dense” configuration) and 1 node employs AMD MI100 GPUs (8 total MI100 GPUs) with 32GB HBM2 RAM. A 1.6TB, NVMe solid-state disk is available for use as local scratch space during job execution on each GPU node type. All Delta GPU compute nodes are interconnected to each other and to the Delta storage resource by a 100 Gb/sec HPE Slingshot network fabric.
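The per-model GPU totals in the description above can be tallied from the stated node counts. This is an illustrative sketch only; the data structure and names are assumptions made for the example, not anything exposed by Delta:

```python
# Tally Delta GPUs by model from the node counts given in the description.
# Each entry: (number of nodes, GPUs per node, GPU model).
node_groups = [
    (100, 4, "A100"),   # single-socket A100 nodes
    (100, 4, "A40"),    # single-socket A40 nodes
    (5, 8, "A100"),     # "dense" A100 nodes
    (1, 8, "MI100"),    # "dense" MI100 node
]

gpus_by_model = {}
for nodes, gpus_per_node, model in node_groups:
    gpus_by_model[model] = gpus_by_model.get(model, 0) + nodes * gpus_per_node

total_gpus = sum(gpus_by_model.values())
print(gpus_by_model, total_gpus)
```

Summing across both node styles gives 440 A100, 400 A40, and 8 MI100 GPUs, or 848 GPUs in total.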

Description

The Delta Storage resource provides storage allocations for projects using the Delta CPU and Delta GPU resources. It delivers 7PB of capacity to projects on Delta and will be augmented by a later expansion of 3PB of flash capacity for high-speed, data-intensive workloads.

Description

The DeltaAI resource comprises 114 NVIDIA quad Grace Hopper nodes interconnected by HPE's Slingshot interconnect. Each Grace Hopper node consists of four NVIDIA super chips with one ARM based CPU, 128 GB of LP-DDR5 RAM and one H100 GPU with 96GB of HBM. The four super chips are tightly coupled with NVLink and share a unified shared memory space.

Open Science Grid

Description

A virtual HTCondor pool made up of resources from the Open Science Grid

Open Storage Network

Description

The Open Storage Network (OSN) is an NSF-funded cloud storage resource, geographically distributed among several pods. OSN pods are currently hosted at SDSC, NCSA, MGHPCC, RENCI, and Johns Hopkins University. Each OSN pod currently hosts 1PB of storage and is connected to R&E networks at 50 Gbps. OSN storage is allocated in buckets and is accessed using S3 interfaces with tools like rclone, Cyberduck, or the AWS CLI.

Pittsburgh Supercomputing Center

Description

Anton is a special purpose supercomputer for biomolecular simulation designed and constructed by D. E. Shaw Research (DESRES). PSC's current system is known as Anton 2 and is a successor to the original Anton 1 machine hosted here.

Anton 2, the next-generation Anton supercomputer, is a 128-node system, made available without cost by DESRES for non-commercial research use by US universities and other not-for-profit institutions, and is hosted by PSC with support from the NIH National Institute of General Medical Sciences. It replaced the original Anton 1 system in the fall of 2016.

Anton was designed to dramatically increase the speed of molecular dynamics (MD) simulations compared with the previous state of the art, allowing biomedical researchers to understand the motions and interactions of proteins and other biologically important molecules over much longer time periods than was previously accessible to computational study. The MD research community is using the Anton 2 machine at PSC to investigate important biological phenomena that due to their intrinsically long time scales have been outside the reach of even the most powerful general-purpose scientific computers. Application areas include biomolecular energy transformation, ion channel selectivity and gating, drug interactions with proteins and nucleic acids, protein folding and protein-membrane signaling.

Description

Bridges-2 combines high-performance computing (HPC), high performance artificial intelligence (HPAI), and large-scale data management to support simulation and modeling, data analytics, community data, and complex workflows.

Bridges-2 Extreme Memory (EM) nodes enable memory-intensive genome sequence assembly, graph analytics, in-memory databases, statistics, and other applications that need a large amount of memory and for which distributed-memory implementations are not available. Bridges-2 Extreme Memory (EM) nodes each consist of 4 Intel Xeon Platinum 8260M “Cascade Lake” CPUs, 4TB of DDR4-2933 RAM, 7.68TB NVMe SSD. They are connected to Bridges-2's other compute nodes and its Ocean parallel filesystem and archive by two HDR-200 InfiniBand links, providing 400Gbps of bandwidth to read or write data from each EM node.

Description

Bridges-2 combines high-performance computing (HPC), high performance artificial intelligence (HPAI), and large-scale data management to support simulation and modeling, data analytics, community data, and complex workflows.

Bridges-2 Accelerated GPU (GPU) nodes are optimized for scalable artificial intelligence (AI; deep learning). They are also available for accelerated simulation and modeling applications. Bridges-2 GPU nodes each contain 8 NVIDIA Tesla V100-32GB SXM2 GPUs, providing 40,960 CUDA cores and 5,120 tensor cores. In addition, each node holds 2 Intel Xeon Gold 6248 CPUs; 512GB of DDR4-2933 RAM; and 7.68TB NVMe SSD. They are connected to Bridges-2's other compute nodes and its Ocean parallel filesystem and archive by two HDR-200 InfiniBand links, providing 400Gbps of bandwidth to enhance scalability of deep learning training.
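The per-node core counts quoted above follow from the per-GPU figures. A minimal sketch, assuming NVIDIA's published V100 specification of 5,120 CUDA cores and 640 Tensor Cores per GPU (these per-GPU numbers are not stated in the description itself):

```python
# Per-node core totals for a Bridges-2 GPU node with 8 V100 GPUs.
# Assumption: per-GPU counts taken from NVIDIA's published V100 specification.
GPUS_PER_NODE = 8
CUDA_CORES_PER_V100 = 5120
TENSOR_CORES_PER_V100 = 640

cuda_cores_per_node = GPUS_PER_NODE * CUDA_CORES_PER_V100
tensor_cores_per_node = GPUS_PER_NODE * TENSOR_CORES_PER_V100
print(cuda_cores_per_node, tensor_cores_per_node)
```

The products match the 40,960 CUDA cores and 5,120 tensor cores given for each node.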

Description

Bridges-2 combines high-performance computing (HPC), high performance artificial intelligence (HPAI), and large-scale data management to support simulation and modeling, data analytics, community data, and complex workflows.

Bridges-2 Regular Memory (RM) nodes provide extremely powerful general-purpose computing, machine learning and data analytics, AI inferencing, and pre- and post-processing. Each Bridges-2 RM node consists of two AMD EPYC “Rome” 7742 64-core CPUs, 256-512GB of RAM, and 3.84TB NVMe SSD. 488 Bridges-2 RM nodes have 256GB RAM, and 16 have 512GB RAM for more memory-intensive applications (see also Bridges-2 Extreme Memory nodes, each of which has 4TB of RAM). Bridges-2 RM nodes are connected to other Bridges-2 compute nodes and its Ocean parallel filesystem and archive by HDR-200 InfiniBand.
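For a sense of aggregate scale, the RM partition totals can be derived from the node counts above. This is a sketch with illustrative names; the totals are derived figures, not numbers published in the description:

```python
# Aggregate node, core, and RAM totals for the Bridges-2 RM nodes.
CORES_PER_NODE = 2 * 64          # two 64-core EPYC 7742 CPUs per node
rm_nodes = {256: 488, 512: 16}   # RAM per node in GB -> number of nodes

total_nodes = sum(rm_nodes.values())
total_cores = total_nodes * CORES_PER_NODE
total_ram_gb = sum(ram_gb * count for ram_gb, count in rm_nodes.items())
print(total_nodes, total_cores, total_ram_gb)
```

This works out to 504 nodes, 64,512 cores, and 133,120GB of RAM across the RM nodes.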

Description

The Bridges-2 Ocean data management system provides a unified, high-performance filesystem for active project data, archive, and resilience. Ocean consists of two tiers, disk and tape, transparently managed by HPE DMF as a single, highly usable namespace.

Ocean's disk subsystem, for active project data, is a high-performance, internally resilient Lustre parallel filesystem with 15PB of usable capacity, configured to deliver up to 129GB/s and 142GB/s of read and write bandwidth, respectively.

Ocean's tape subsystem, for archive and additional resilience, is a high-performance tape library with 7.2PB of uncompressed capacity (estimated 8.6PB compressed, with compression done transparently in hardware with no performance overhead), configured to deliver 50TB/hour.
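To put the tape throughput in perspective, one can estimate how long it would take to stream the full uncompressed capacity. A rough sketch, assuming the 50TB/hour rate is sustained throughout and decimal units (1PB = 1,000TB); both are simplifying assumptions for illustration:

```python
# Time to stream the full uncompressed tape capacity at the quoted rate.
# Assumptions: sustained 50 TB/hour and 1 PB = 1,000 TB.
TAPE_CAPACITY_TB = 7200      # 7.2 PB uncompressed
THROUGHPUT_TB_PER_HOUR = 50

hours_to_stream = TAPE_CAPACITY_TB / THROUGHPUT_TB_PER_HOUR
days_to_stream = hours_to_stream / 24
print(f"{hours_to_stream:.0f} hours (~{days_to_stream:.0f} days)")
```

Under these assumptions, a full pass over the library takes 144 hours, or about 6 days.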

Purdue University

Description

Purdue's Anvil cluster built in partnership with Dell and AMD consists of 1,000 nodes with two 64-core AMD EPYC "Milan" processors each and delivers over 1 billion CPU core hours each year, with a peak performance of 5.1 petaflops. Each of these nodes has 256GB of DDR4-3200 memory. A separate set of 32 large memory nodes has 1TB of DDR4-3200 memory each. Anvil's nodes are interconnected with 100 Gbps Mellanox HDR100 InfiniBand.
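The "over 1 billion CPU core hours each year" figure can be reproduced from the node and core counts. A minimal sketch, assuming all 1,000 nodes run continuously for a 365-day year (names are illustrative):

```python
# Yearly core-hour capacity of the 1,000 standard Anvil nodes.
# Assumption: full availability, 24 hours a day for 365 days.
NODES = 1000
CORES_PER_NODE = 2 * 64      # two 64-core EPYC "Milan" processors
HOURS_PER_YEAR = 24 * 365

core_hours_per_year = NODES * CORES_PER_NODE * HOURS_PER_YEAR
print(f"{core_hours_per_year:,} core hours per year")
```

The result, 1,121,280,000 core hours, is consistent with the stated figure.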

Description

16 nodes each with four NVIDIA A100 Tensor Core GPUs providing 1.5 PF of single-precision performance to support machine learning and artificial intelligence applications.

San Diego Supercomputer Center

Description

Expanse is a Dell integrated compute cluster, with AMD Rome processors, 128 cores per node, interconnected with Mellanox HDR InfiniBand in a hybrid fat-tree topology. The compute node section of Expanse has a peak performance of 3.373 PF. Full bisection bandwidth is available at rack level (56 compute nodes) with HDR100 connectivity to each node. HDR200 switches are used at the rack level and 3:1 oversubscription cross-rack. Compute nodes feature 1TB of NVMe storage and 256GB of DRAM per node. The system also features 12PB of Lustre based performance storage (140GB/s aggregate), and 7PB of Ceph based object storage.

Description

Expanse is a Dell integrated compute cluster, with AMD Rome processors and NVIDIA V100 GPUs, interconnected with Mellanox HDR InfiniBand in a hybrid fat-tree topology. The GPU component of Expanse features 52 GPU nodes, each containing four NVIDIA V100s (32 GB SXM2), connected via NVLink, and dual 20-core Intel Xeon 6248 CPUs. They feature 1.6TB of NVMe storage and 256GB of DRAM per node. There is HDR100 connectivity to each node. The system also features 12PB of Lustre based performance storage (140GB/s aggregate), and 7PB of Ceph based object storage.

Description

5PB of storage on a Lustre based filesystem.

Description

Voyager is a heterogeneous system designed to support complex deep learning AI workflows. The system features 42 Intel Habana Gaudi training nodes, each with 8 training processors (336 in total). Each training node has 512GB of memory and 6.4TB of node local NVMe storage. The Gaudi training processors feature specialized hardware units for AI, HBM2, and on-chip high-speed Ethernet. The on-chip Ethernet ports are used in a non-blocking all-to-all network between processors on a node, and the remaining ports are aggregated into six 400G connections on each node that are plugged into a 400G Arista switch to provide a scale-out network. Voyager also has two first-generation inference nodes, each with 8 inference processors (16 in total). In addition to the custom AI hardware, the system also has 36 Intel x86 compute nodes for general purpose computing and data processing. Voyager features 3PB of storage currently deployed as a Ceph filesystem.

Texas A&M University

Description

ACES is a Dell cluster with a rich accelerator testbed consisting of Intel Max GPUs (Graphics Processing Units), Intel FPGAs (Field Programmable Gate Arrays), NVIDIA H100 and A30 GPUs, NEC Vector Engines, NextSilicon co-processors, and Graphcore IPUs (Intelligence Processing Units). The ACES cluster consists of compute nodes using a mix of the following processors:

- Intel Xeon 8468 Sapphire Rapids processors
- Intel Xeon Ice Lake 8352Y processors
- Intel Xeon Cascade Lake 8268 processors
- AMD Epyc Rome 7742 processors

The compute nodes are interconnected with NVIDIA NDR200 connections for MPI and access to the Lustre storage. The Intel Optane SSDs and all accelerators (except the Graphcore IPUs and NEC Vector Engines) are accessed using Liqid's composable infrastructure via PCIe (Peripheral Component Interconnect express) Gen4 and Gen5 fabrics.

Description

FASTER (Fostering Accelerated Scientific Transformations, Education and Research) is funded by the NSF MRI program (Award #2019129) and provides a composable high-performance data-analysis and computing instrument. The FASTER system has 180 compute nodes with 2 Intel 32-core Ice Lake processors and 256 GB RAM, and includes 240 NVIDIA GPUs (40 A100 and 200 T4 GPUs). Using LIQID’s composable technology, all 180 compute nodes have access to the pool of available GPUs, dramatically improving workflow scalability. FASTER will have HDR InfiniBand interconnection and shared access to a 5PB (usable) high-performance storage system running the Lustre filesystem. Thirty percent of FASTER’s computing resources will be allocated to researchers nationwide through XSEDE’s XRAC process.

Description

Launch is a Dell Linux cluster with 45 compute nodes (8,640 cores) and 2 login nodes. There are 35 compute nodes with 384 GB memory and 10 GPU compute nodes with 768 GB memory and two NVIDIA A30s. The interconnecting fabric uses a single NVIDIA HDR100 InfiniBand switch.

Texas Advanced Computing Center

Description

The Stampede2 Dell/Intel Knights Landing (KNL), Skylake (SKX) System provides the user community access to two Intel Xeon compute technologies.

The system is configured with 4204 Dell KNL compute nodes, each with a stand-alone Intel Xeon Phi Knights Landing bootable processor. Each KNL node includes 68 cores, 16GB MCDRAM, 96GB DDR4 memory and a 200GB SSD.

Stampede2 also includes 1736 Intel Xeon Skylake (SKX) nodes and additional management nodes. Each SKX node includes 48 cores, 192GB DDR4 memory, and a 200GB SSD.

Allocations awarded on Stampede2 may be used on either or both of the node types.

Compute nodes have access to dedicated Lustre Parallel file systems totaling 28PB raw, provided by Cray. An Intel Omni-Path Architecture switch fabric connects the nodes and storage through a fat-tree topology with a point to point bandwidth of 100 Gb/s (unidirectional speed). 16 additional login and management servers complete the system. Stampede2 will deliver an estimated 18PF of peak performance.
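As a quick sense of scale, the total compute core count can be derived from the KNL and SKX node counts given above. This is an illustrative sketch; the total is a derived figure rather than one stated in the description:

```python
# Total compute cores on Stampede2 across both node types.
# Each entry: node type -> (number of nodes, cores per node).
node_types = {
    "KNL": (4204, 68),
    "SKX": (1736, 48),
}

cores_by_type = {name: nodes * cores for name, (nodes, cores) in node_types.items()}
total_cores = sum(cores_by_type.values())
print(cores_by_type, total_cores)
```

This gives 285,872 KNL cores and 83,328 SKX cores, or 369,200 compute cores in total.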

Please see the Stampede2 User Guide for detailed information on the system and how to most effectively use it.

https://portal.xsede.org/tacc-stampede2

Description

TACC's long-term mass storage solution, Ranch, is an Oracle® StorageTek Modular Library System. Ranch utilizes Oracle's Sun Storage Archive Manager Filesystem (SAM-FS) for migrating files to/from a tape archival system with a current offline storage capacity of 60 PB.

Ranch's disk cache is built on Oracle's ZFS 7240 and Dell MD3600i disk arrays containing approximately 640 TB of usable spinning disk storage. These disk arrays are controlled by a Dell R720 SAM-FS Metadata server which has 16 CPUs and 72 GB of RAM.

Two Oracle StorageTek SL8500 Automated Tape Libraries house all of the offline archival storage. Each SL8500 library can house up to 10,000 tapes with 64 tape drive slots. One SL8500 is currently populated with 10,000 T-10000B media where each tape is capable of holding one TB of uncompressed data while the second SL8500 houses 6,000 of the latest T-10000C media which can hold five TB of uncompressed data. Each SL8500 library also contains eight handbots to manage tapes and move them to/from the tape drives with a pass-through port connecting the two SL8500 libraries. If necessary, up to four SL8500 libraries can be integrated into a single archival solution, allowing for an offline storage capacity of 200 PB with current tape media.
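The 200 PB upper bound follows from the library and media figures above. A minimal sketch, assuming four fully populated SL8500 libraries, T-10000C media at five TB per tape, and decimal units (1 PB = 1,000 TB); the names are illustrative:

```python
# Maximum offline capacity with four fully populated SL8500 libraries.
# Assumptions: every slot holds a 5 TB T-10000C tape; 1 PB = 1,000 TB.
MAX_LIBRARIES = 4
TAPES_PER_LIBRARY = 10_000
TB_PER_TAPE = 5
TB_PER_PB = 1_000

max_capacity_tb = MAX_LIBRARIES * TAPES_PER_LIBRARY * TB_PER_TAPE
max_capacity_pb = max_capacity_tb // TB_PER_PB
print(f"{max_capacity_pb} PB")
```

Under these assumptions the four-library configuration holds 200,000 TB, matching the 200 PB figure.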

University of Delaware

Description

Nodes with two AMD EPYC™ 7502 processors (32 cores each) with three memory size options:
- 48x standard 512 GiB
- 32x large-memory 1024 GiB
- 11x xlarge-memory 2048 GiB
- 1x lg-swap 1024 GiB RAM + 2.73 TiB Intel Optane NVMe swap
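The node mix above can be summarized with a short calculation. This is an illustrative sketch; the totals are derived from the listed counts and are not figures stated in the description:

```python
# Aggregate node, core, and RAM totals for the DARWIN compute nodes listed above.
CORES_PER_NODE = 2 * 32   # two 32-core EPYC 7502 processors per node

# Node type -> (number of nodes, RAM per node in GiB).
node_types = {
    "standard": (48, 512),
    "large-memory": (32, 1024),
    "xlarge-memory": (11, 2048),
    "lg-swap": (1, 1024),   # Optane NVMe swap is not counted as RAM here
}

total_nodes = sum(count for count, _ in node_types.values())
total_cores = total_nodes * CORES_PER_NODE
total_ram_gib = sum(count * ram for count, ram in node_types.values())
print(total_nodes, total_cores, total_ram_gib)
```

Together these configurations amount to 92 nodes, 5,888 cores, and 80,896 GiB of RAM.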

Description

- 3 nodes with two Intel® Xeon® Platinum 8260 processors (24 cores each), 768 GiB RAM, and 4 NVIDIA Tesla V100 32GB GPUs connected via NVLINK™
- 9 nodes with two AMD EPYC™ 7502 processors (32 cores each), 512 GiB RAM, and a single NVIDIA Tesla T4 GPU
- 1 node with two AMD EPYC™ 7502 processors (32 cores each), 512 GiB RAM, and a single AMD Radeon Instinct MI50 GPU

Description

DARWIN's Lustre file system is for use with the DARWIN Compute and GPU nodes.

University of Kentucky

Description

Five large memory compute nodes dedicated for XSEDE allocation. Each of these nodes has 40 cores (Broadwell-class Intel(R) Xeon(R) CPU E7-4820 v4 @ 2.00GHz with 4 sockets, 10 cores/socket), 3TB RAM, and 6TB SSD storage drives. The 5 dedicated XSEDE nodes will have exclusive access to approximately 300 TB of network attached disk storage. All these compute nodes are interconnected through a 100 Gigabit Ethernet (100GbE) backbone, and the cluster login and data transfer nodes will be connected through a 100Gb uplink to Internet2 for external connections.

University of Texas at Austin

Description

Stampede3 is generously funded through the National Science Foundation and is designed to serve today's researchers as well as support the research community on an evolutionary path toward many-core processors and accelerated technologies. Stampede3 maintains the familiar programming model for all of today's users, and thus will be broadly useful for traditional simulation users, users performing data intensive computations, and emerging classes of new users.