Comparison of cluster software
The following tables compare general and technical information for notable computer cluster software. This software can be broadly divided into four categories: job scheduler, nodes management, nodes installation, and integrated stack (all of the above).
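As a concrete illustration of the job-scheduler category, the sketch below submits a batch job through a scheduler's command-line interface. It is a minimal sketch, assuming a working Slurm installation with `sbatch` on the PATH; the command and resource counts are placeholders, and the other schedulers listed below (PBS Pro, LSF, Grid Engine, HTCondor) expose equivalent submission tools.

```python
# Minimal sketch: handing a command to a job scheduler (here Slurm's sbatch CLI).
# Assumes Slurm is installed and the user can submit to a partition; the command
# and the node/task counts are placeholders, not values from this article.
import subprocess

def submit_job(command: str, nodes: int = 1, tasks: int = 1) -> str:
    """Submit `command` as a batch job and return the scheduler's job ID."""
    result = subprocess.run(
        ["sbatch", f"--nodes={nodes}", f"--ntasks={tasks}",
         "--wrap", command],  # --wrap turns a plain shell command into a job script
        check=True, capture_output=True, text=True,
    )
    # sbatch normally replies with "Submitted batch job <id>"
    return result.stdout.strip().split()[-1]

if __name__ == "__main__":
    job_id = submit_job("hostname")
    print(f"queued as job {job_id}")
```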
General information
Software | Maintainer | Category | Development status | Latest release | Architecture | High-Performance / High-Throughput Computing | License | Platforms supported | Cost | Paid support available |
---|---|---|---|---|---|---|---|---|---|---|
Amoeba | | | No active development | | | | MIT | | | |
Base One Foundation Component Library | | | | | | | Proprietary | | | |
DIET | INRIA, SysFera, Open Source | All in one | | | GridRPC, SPMD, Hierarchical and distributed architecture, CORBA | HTC/HPC | CeCILL | Unix-like, Mac OS X, AIX | Free | |
DxEnterprise | DH2i | Nodes management | Actively developed | v23.0 | | | Proprietary | Windows 2012R2/2016/2019/2022 and 8+, RHEL 7/8/9, CentOS 7, Ubuntu 16.04/18.04/20.04/22.04, SLES 15.4 | Cost | Yes |
Enduro/X | Mavimax, Ltd. | Job/Data Scheduler | actively developed | | SOA Grid | HTC/HPC/HA | GPLv2 or Commercial | Linux, FreeBSD, MacOS, Solaris, AIX | Free / Cost | Yes |
Ganglia | | Monitoring | actively developed | 3.7.2[1] 14 June 2016 | | | BSD | Unix, Linux, Microsoft Windows NT/XP/2000/2003/2008, FreeBSD, NetBSD, OpenBSD, DragonflyBSD, Mac OS X, Solaris, AIX, IRIX, Tru64, HPUX | Free | |
Grid MP | Univa (formerly United Devices) | Job Scheduler | no active development | | Distributed master/worker | HTC/HPC | Proprietary | Windows, Linux, Mac OS X, Solaris | Cost | |
Apache Mesos | Apache | | actively developed | | | | Apache license v2.0 | Linux | Free | Yes |
Moab Cluster Suite | Adaptive Computing | Job Scheduler | actively developed | | | HPC | Proprietary | Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms | Cost | Yes |
NetworkComputer | Runtime Design Automation | | actively developed | | | HTC/HPC | Proprietary | Unix-like, Windows | Cost | |
OpenHPC | OpenHPC project | All in one | actively developed | v2.6.1 February 2, 2023 | | HPC | | Linux (CentOS / OpenSUSE Leap) | Free | No |
OpenLava | None. Formerly Teraproc | Job Scheduler | Halted by injunction | | Master/Worker, multiple admin/submit nodes | HTC/HPC | Illegal due to being a pirated version of IBM Spectrum LSF | Linux | Not legally available | No |
PBS Pro | Altair | Job Scheduler | actively developed | | Master/worker distributed with fail-over | HPC/HTC | AGPL or Proprietary | Linux, Windows | Free or Cost | Yes |
Proxmox Virtual Environment | Proxmox Server Solutions | Complete | actively developed | | | | Open-source AGPLv3 | Linux, Windows, other operating systems are known to work and are community supported | Free | Yes |
Rocks Cluster Distribution | Open Source/NSF grant | All in one | actively developed | 7.0[2] (Manzanita) 1 December 2017 | | HTC/HPC | OpenSource | CentOS | Free | |
Popular Power | | | | | | | | | | |
ProActive | INRIA, ActiveEon, Open Source | All in one | actively developed | | Master/Worker, SPMD, Distributed Component Model, Skeletons | HTC/HPC | GPL | Unix-like, Windows, Mac OS X | Free | |
RPyC | Tomer Filiba | | actively developed | | | | MIT License | *nix/Windows | Free | |
SLURM | SchedMD | Job Scheduler | actively developed | v23.11.3 January 24, 2024 | | HPC/HTC | GPL | Linux/*nix | Free | Yes |
Spectrum LSF | IBM | Job Scheduler | actively developed | | Master node with failover/exec clients, multiple admin/submit nodes, Suite add-ons | HPC/HTC | Proprietary | Unix, Linux, Windows | Cost and Academic model (Academic, Express, Standard, Advanced and Suites) | Yes |
Oracle Grid Engine (Sun Grid Engine, SGE) | Altair | Job Scheduler | active; development moved to Altair Grid Engine | | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Proprietary | *nix/Windows | Cost | |
Some Grid Engine / Son of Grid Engine / Sun Grid Engine | daimh | Job Scheduler | actively developed (stable/maintenance) | | Master node/exec clients, multiple admin/submit nodes | HPC/HTC | Open-source SISSL | *nix | Free | No |
SynfiniWay | Fujitsu | | actively developed | | | HTC/HPC | ? | Unix, Linux, Windows | Cost | |
Techila Distributed Computing Engine | Techila Technologies Ltd. | All in one | actively developed | | Master/worker distributed | HTC | Proprietary | Linux, Windows | Cost | Yes |
TORQUE Resource Manager | Adaptive Computing | Job Scheduler | actively developed | | | | Proprietary | Linux, *nix | Cost | Yes |
UniCluster | Univa | All in one | Functionality and development moved to UniCloud | | | | | | Free | Yes |
UNICORE | | | | | | | | | | |
Xgrid | Apple Computer | | | | | | | | | |
Warewulf | | Provisioning and cluster management | actively developed | v4.4.1 July 6, 2023 | | HPC | Open Source | Linux | Free | |
xCAT | | Provisioning and cluster management | actively developed | v2.16.5 March 7, 2023 | | HPC | Eclipse Public License | Linux | Free | |
Table explanation
- Software: The name of the application that is described
Technical information
Software | Implementation Language | Authentication | Encryption | Integrity | Global File System | Global File System + Kerberos | Heterogeneous/ Homogeneous exec node | Jobs priority | Group priority | Queue type | SMP aware | Max exec node | Max job submitted | CPU scavenging | Parallel job | Job checkpointing | Python interface |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Enduro/X | C/C++ | OS Authentication | GPG, AES-128, SHA1 | None | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Any cluster Posix FS (gfs, gpfs, ocfs, etc.) | Heterogeneous | OS Nice level | OS Nice level | SOA Queues, FIFO | Yes | OS Limits | OS Limits | Yes | Yes | No | No |
HTCondor | C++ | GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous | None, Triple DES, BLOWFISH | None, MD5 | None, NFS, AFS | Not official, hack with ACL and NFS4 | Heterogeneous | Yes | Yes | Fair-share with some programmability | basic (hard separation into different node) | tested ~10000? | tested ~100000? | Yes | MPI, OpenMP, PVM | Yes | Yes, and native Python Binding |
PBS Pro | C/Python | OS Authentication, Munge | | | Any, e.g., NFS, Lustre, GPFS, AFS | Limited availability | Heterogeneous | Yes | Yes | Fully configurable | Yes | tested ~50,000 | Millions | Yes | MPI, OpenMP | Yes | Yes |
OpenLava | C/C++ | OS authentication | None | | NFS | | Heterogeneous Linux | Yes | Yes | Configurable | Yes | | | Yes, supports preemption based on priority | Yes | Yes | No |
Slurm | C | Munge, None, Kerberos | | | | | Heterogeneous | Yes | Yes | Multifactor Fair-share | Yes | tested 120k | tested 100k | No | Yes | Yes | PySlurm |
Spectrum LSF | C/C++ | Multiple - OS Authentication/Kerberos | Optional | Optional | Any - GPFS/Spectrum Scale, NFS, SMB | Any - GPFS/Spectrum Scale, NFS, SMB | Heterogeneous - HW and OS agnostic (AIX, Linux or Windows) | Policy based - no queue to compute node binding | Policy based - no queue to compute group binding | Batch, interactive, checkpointing, parallel and combinations | Yes, and GPU aware (GPU license free) | > 9,000 compute hosts | > 4 million jobs a day | Yes, supports preemption based on priority, supports checkpointing/resume | Yes, e.g. parallel submissions for job collaboration over MPI | Yes, with support for user, kernel or library level checkpointing environments | Yes |
Torque | C | SSH, munge | None, any | | | | Heterogeneous | Yes | Yes | Programmable | Yes | tested | tested | Yes | Yes | Yes | Yes |
Table explanation
- Software: The name of the application that is described
- SMP aware (see the sketch after this list):
  - basic: hard split into multiple virtual hosts
  - basic+: hard split into multiple virtual hosts, with some minimal/incomplete communication between virtual hosts on the same computer
  - dynamic: splits the computer's resources (CPU/RAM) on demand
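The "dynamic" level corresponds to a scheduler handing each job a CPU/RAM slice of a shared SMP node. The sketch below shows what such a request can look like through a scheduler's Python interface (compare the "Python interface" column above); it is a minimal sketch assuming the HTCondor Python bindings (version 9 or later, where `Schedd.submit()` accepts a `Submit` object) and a reachable schedd, and the executable and resource figures are placeholders.

```python
# Sketch: asking an SMP-aware scheduler for a slice of one node's CPUs and RAM.
# Assumes the HTCondor Python bindings (htcondor package) and a reachable schedd;
# the executable and resource figures are placeholders, not values from the tables.
import htcondor

job = htcondor.Submit({
    "executable": "/bin/sleep",
    "arguments": "300",
    "request_cpus": "8",        # eight cores on the same execute node
    "request_memory": "16GB",   # plus a 16 GB memory slice
    "output": "smp_test.out",
    "error": "smp_test.err",
    "log": "smp_test.log",
})

schedd = htcondor.Schedd()      # local scheduler daemon
result = schedd.submit(job)     # returns a SubmitResult in bindings >= 9
print("submitted cluster", result.cluster())
```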
See also
- List of volunteer computing projects
- List of cluster management software
- Computer cluster
- Grid computing
- World Community Grid
- Distributed computing
- Distributed resource management
- High-Throughput Computing
- Job Processing Cycle
- Batch processing
- Fallacies of Distributed Computing
References
- ^ "Release 3.7.2".
- ^ "Rocks 7.0 is Released". 1 December 2017. Retrieved 17 November 2022.