What is Fluid-Slurm-GCP?
Fluid Numerics' Slurm-GCP deployment leverages Google Compute Engine resources and the Slurm job scheduler to execute high performance computing (HPC) and high throughput computing (HTC) workloads. Compute nodes are created on-the-fly to execute jobs using custom compute node images. Slurm automatically removes idle compute nodes to minimize the expense of unused compute resources.
Looking to experiment with operating your own HPC cluster? Our Google Cloud Marketplace solution is a great place to get started with a click-to-deploy cluster. Within 30 minutes, you can be running HPC and HTC applications in any of Google's datacenters worldwide.
HPC Cluster with Terraform
Want to build out more complex infrastructure with a cloud-native HPC cluster and manage your resources using infrastructure-as-code? Use our Terraform modules and examples to deploy and manage your fluid-slurm-gcp cluster alongside other infrastructure components.
Fully Managed HPC Cluster
Let us help you! Simply let us know what you want to see in an HPC cluster. We will take care of provisioning Cloud Identity accounts, secure IAM policies, networking infrastructure, and your cloud-native HPC cluster. When it's ready, you'll be able to SSH to your cluster just as you would a traditional HPC system.
Learn how to quickly launch Fluid Numerics' Slurm-GCP deployment and submit your first job on the cluster.
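As a sketch, submitting a first job means writing a small Slurm batch script and handing it to `sbatch`. The script below is a minimal example; the resource requests are placeholders, and you should run `sinfo` on your login node to see which partitions your cluster actually has.

```shell
# Minimal first-job sketch; resource requests are placeholders.
cat > hello_slurm.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
srun hostname
EOF
# On the cluster: submit with `sbatch hello_slurm.sh`, then watch Slurm
# create an on-the-fly compute node for it with `squeue`.
```

Because compute nodes are created on demand, the first job may sit in a pending state for a minute or two while its node boots.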
July 2020 (v2.4.0)
- Upgrade to Slurm 20.02
- Add support for easy CloudSQL integration
- GSuite SMTP email relay integration for email notifications on job completion
- Terraform modules and examples now publicly available!
- (bugfix) Enabled storage.full auth-scope for GCSFuse
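With the SMTP email relay configured, job-completion notifications use Slurm's standard mail directives; the relay only handles delivery. A hedged sketch (the mail address is a placeholder):

```shell
# Standard Slurm mail options; the relay integration delivers the message.
# The address below is a placeholder -- use your own.
cat > notify_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=notify-demo
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=you@example.com
srun hostname
EOF
# Submit on the cluster with: sbatch notify_job.sh
```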
April 2020 (v2.3.0)
- (feature upgrade) GCP Marketplace solutions now come with read-write access scopes to GCS storage
- (bugfix) Resolved issue on compute nodes with hyperthreading disabled causing incorrect core-count configuration in slurm.conf
- python/2.7.1 and python/3.8.0 are now available under /apps and through environment modules.
The cluster-services CLI has been updated with the Version 2.3.0 release of fluid-slurm-gcp. Updates include:
- Updated help documentation
- The default_partition item has been added to the cluster-config schema, allowing users to specify a default Slurm partition.
- A --preview flag for all update commands lets you preview changes to your cluster before actually making them
- The cluster-services add user --name flag has been removed. Individual users can be added to the default Slurm account using cluster-services add user <name>
- Users can now obtain template cluster-config blocks using cluster-services sample all/mounts/partitions/slurm_accounts
- User-provided cluster-configs are now validated against the cluster-config schema
- Added logging to cluster-services
- Fixed the incorrect core-count bug for compute nodes with hyperthreading disabled
- Removed the add/remove mounts and add/remove partitions options; mounts and partitions are now updated using cluster-services update mounts and/or cluster-services update partitions calls. The add/remove user calls only add or remove a user in the default Slurm account; these calls are strictly convenience calls.
- The cluster-config schema now specifies compute, controller, and login images in dedicated fields, e.g. login_image
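The commands above combine into a preview-then-apply workflow. The YAML fragment below is illustrative only; default_partition is the documented v2.3.0 field, and the authoritative templates come from `cluster-services sample`.

```shell
# Illustrative fragment only -- generate real templates with
# `cluster-services sample all/mounts/partitions/slurm_accounts`.
cat > cluster-config-fragment.yaml <<'EOF'
default_partition: partition-1
EOF
# Typical v2.3.0 workflow, run on the cluster where cluster-services lives:
#   cluster-services sample partitions            # get a template block
#   cluster-services update partitions --preview  # preview the changes first
#   cluster-services update partitions            # apply the changes
```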
March 2020 (Fluid-Slurm-GCP+Ubuntu)
Fluid Numerics has released another flavor of fluid-slurm-gcp on GCP Marketplace that is based on the Ubuntu operating system!
In addition to the flexible multi-project, multi-region, and multi-zone support of "classic" fluid-slurm-gcp, the fluid-slurm-gcp-ubuntu solution includes:
- Ubuntu 19.10 Operating System
- zfs-utils for ZFS filesystem management (but no Lustre kernels)
- apt package manager
- Environment modules, Spack, and Singularity (same as the classic fluid-slurm-gcp)
March 2020 (Fluid-Slurm-GCP+OpenHPC)
Fluid Numerics has released another flavor of fluid-slurm-gcp on GCP Marketplace with pre-installed OpenHPC packages.
In addition to the flexible multi-project, multi-region, and multi-zone support of "classic" fluid-slurm-gcp, the fluid-slurm-gcp+openhpc solution includes:
- Lmod environment modules
- GCC 8.2.0 compilers
- MPICH, MVAPICH, and OpenMPI
- Serial and Parallel IO Libraries (HDF5, NetCDF, Adios)
- HPC Profilers/Performance Tuning Toolkits (Score-P, Tau, Scalasca)
- Scientific libraries for HPC (MFEM, PETSc, Trilinos, and much more!)
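A quick way to exercise the compiler and MPI stacks above is an MPI hello-world. The module names below follow common OpenHPC conventions and are assumptions; check `module avail` on your image for the actual names. The C source itself is standard MPI.

```shell
# MPI smoke test for the OpenHPC toolchain.
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
# On the cluster (module names are assumptions -- check `module avail`):
#   module load gnu8 openmpi3    # or mpich / mvapich2
#   mpicc hello_mpi.c -o hello_mpi
#   srun -n 4 ./hello_mpi
```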
February 2020 ( v2.0.0 )
Fluid Numerics released upgrades to the Fluid-Slurm-GCP marketplace deployment and the cluster-services CLI toolkit in February 2020. These upgrades came about from cluster-configuration schema changes that permit:
- Specification of multiple compute machines per partition
- Support for multiple GCP regions and multiple GCP zones
- User defined compute machine names
- User defined Slurm accounts, and user-partition alignment through cluster-services
- Multi-project ready cluster-configuration schema for simplified Orbitera billing platform integrations
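The multi-machine, multi-zone capability can be pictured with a hedged cluster-config sketch. The field names below are assumptions for illustration only; the real schema comes from `cluster-services sample partitions`.

```shell
# Conceptual sketch: one partition, two machine types in different zones.
# Field names are assumptions -- consult `cluster-services sample partitions`.
cat > partitions-sketch.yaml <<'EOF'
partitions:
- name: partition-1
  machines:
  - name: small-nodes
    machine_type: n1-standard-8
    zone: us-west1-b
  - name: big-nodes
    machine_type: n1-standard-64
    zone: us-central1-a
EOF
grep -c 'zone:' partitions-sketch.yaml   # two zone entries -> multi-zone
```

The point of the sketch is the shape, not the names: a single Slurm partition can now fan out over multiple machine definitions, each pinned to its own GCP zone.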