Overview of the PPPL Research Clusters
The PPPL cluster provides convenient, easy-to-use computing resources for PPPL.
It serves as a computing facility for the small to mid-size jobs not favored at
the leadership computing facilities.
It consists of several sub-clusters.
The entry point to the PPPL clusters is portal.pppl.gov.
Users log in to the cluster through this load-balanced pool of systems
using their PPPL Unix accounts. To get an account, send email to lscimeca@pppl.gov.
On portal, users test programs, stage their jobs, submit their jobs, and do other
interactive tasks.
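For example, a user can connect from a terminal with SSH ("myuser" below is a placeholder for your PPPL Unix account name):

    # Log in to the load-balanced portal pool
    ssh myuser@portal.pppl.gov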
PPPL's clusters are typically named after notable PPPL research physicists:
The dawson cluster, named after John Dawson, is the main cluster, with thousands
of CPU cores and 2 GB of memory per core. It is used for parallel (>4 CPU) jobs
and is linked via a low-latency private 10 Gb/s network.
The ellis cluster, named after Robert Ellis, has ~160 CPUs and 4 GB of memory
per core. It is used exclusively for small (1-4 CPU) jobs and is linked via a
1 Gb/s private network.
The kruskal cluster, named after Martin Kruskal, has 36 systems, each with
32 CPU cores and 2 GB of memory per core, linked via a 40 Gb/s InfiniBand
interconnect.
The greene cluster, named after John Greene, has 48 systems, each with 16 CPUs
and a large memory size (32-128 GB of RAM per system), linked via a 40 Gb/s
InfiniBand interconnect. This cluster is used mainly by the XGC and M3D groups.
The ganesh cluster, named after the Hindu elephant god Ganesh, has systems with
a large memory size (>= 4 GB of RAM per CPU). Several systems contain 32 cores
and 192 GB of memory, and two have 64 cores and 384 GB. These systems are useful
for very large simulations or modeling applications.
The GPU cluster has systems with NVIDIA Tesla GPU cards. The system gpusrv01 contains two NVIDIA Tesla C2070 cards, each with 448 GPU cores (896 GPU cores in total), and the system gpusrv02 has two NVIDIA Tesla K20 cards, each with 2496 GPU cores (4992 GPU cores in total). Each system has 32 CPU cores.
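Once logged in to one of these systems, the stock NVIDIA nvidia-smi utility can confirm which cards are present (this assumes the NVIDIA driver is installed, as it must be on GPU nodes):

    # List installed GPUs along with utilization and memory use
    nvidia-smi
    # Show detailed per-GPU properties
    nvidia-smi -q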
Based on the job attributes specified in a job script, the job scheduler will
select the right cluster and set of systems on which to run a user's job, queue
the job for execution, and, when an appropriate set of systems becomes available,
execute the job.
It is therefore not necessary for the user to specify a job queue in the job script
unless a special interconnect (e.g., InfiniBand), large memory, or some other feature is required.
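As an illustration, a minimal job script might look like the sketch below. This assumes a Slurm-style scheduler; the directive syntax, resource values, and the simulation binary are placeholders, so check the PPPL documentation for the exact form used on portal.

    #!/bin/bash
    # Hypothetical Slurm-style job script: only resource attributes are
    # given, so the scheduler chooses the cluster; no queue is named.
    #SBATCH --job-name=my_sim         # placeholder job name
    #SBATCH --ntasks=32               # a parallel (>4 CPU) job
    #SBATCH --mem-per-cpu=2G          # matches the 2 GB-per-core clusters
    #SBATCH --time=04:00:00           # wall-clock limit

    # Launch the (placeholder) MPI executable on the allocated cores
    srun ./my_simulation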
The PPPL cluster also offers interactive use of its nodes.
Users can enter the use command on portal to reserve one or more
systems for their exclusive use.
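The syntax of the use command itself is site-specific and documented on portal. As a rough equivalent, and only under the assumption of a Slurm-style scheduler, an exclusive interactive session could be requested as follows (this is a sketch, not the PPPL use command):

    # Request an exclusive interactive shell on one node for two hours
    srun --nodes=1 --exclusive --time=02:00:00 --pty bash
    # ...work interactively on the allocated system...
    exit   # ends the shell and releases the node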
Users' home directories (/u) and project directories (/p) are served by
high-speed UNIX/ZFS file servers, which are responsible for the storage and backup of files.
Home directories are limited in size and thus should not be used for project data.
Most projects already have directories created, and new users can be added
to the group that has write access to a given project. For a new project, a
project directory can be created by submitting a request in the
helpdesk ticketing system.
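Standard tools show how close a home directory is to its limit (if user quotas are reported via the usual quota utility) and how much space the project filesystem has; the project path below is a placeholder:

    # Report home-directory quota usage in human-readable form
    quota -s
    # Show free space on the project filesystem
    df -h /p
    # Keep large data under the project area rather than /u
    mkdir -p /p/myproject/run01    # "myproject" is a placeholder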