HPC Center user guides

Using the resources of the TalTech HPC Centre requires an active Uni-ID account (please write to hpcsupport@taltech.ee to activate access); the procedure for non-employees/non-students can be found here (in Estonian). In addition, the user needs to be added to the HPC-USERS group; please ask hpcsupport@taltech.ee (from your Uni-ID e-mail account) to activate HPC access. When using licensed programs, the user must also be added to the appropriate group. More information about available programs and licenses can be found in the Software packages section.
The cluster runs a Linux operating system (based on CentOS; Debian or Ubuntu on special-purpose nodes) and uses SLURM as its batch scheduler and resource manager. Linux is the dominant operating system for scientific computing and is, as of now, the only operating system present in the TOP500 list (a list of the 500 most powerful computers in the world).
Linux command-line knowledge is therefore essential for using the cluster. By learning Linux and using the TalTech clusters, you also acquire the skills needed to access international supercomputing centres (e.g. LUMI or any of the PRACE centres).
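As a minimal sketch of what working on the command line looks like (assuming login to base.hpc.taltech.ee, which is described below; `<uni-id>` is a placeholder for your Uni-ID username):

```bash
# Log in to the cluster (replace <uni-id> with your Uni-ID username).
ssh <uni-id>@base.hpc.taltech.ee

# A few basic commands to get oriented:
pwd             # print the current (home) directory
ls -l           # list files in the current directory
module avail    # list software provided through the module environment (lmod)
```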
Hardware Specification
- TalTech ETAIS Cloud: 4-node OpenStack cloud
  - 5 compute (nova) nodes with 768 GB of RAM and 80 threads each
  - 65 TB CephFS storage (net capacity)
  - accessible through the ETAIS website: https://etais.ee/using/
- base.hpc.taltech.ee is the new cluster environment; all nodes from HPC1 and HPC2 will be migrated here
  - SLURM v20 scheduler; a live load diagram is available
  - home directory file system with 1.5 PB of storage and a 2 TB/user quota
  - 32 green nodes (former hpc2.ttu.ee nodes): 2x Intel Xeon Gold 6148 20C 2.40 GHz, 96 GB DDR4-2666 R ECC RAM (green[1-32]), 25 Gbit Ethernet, 18 of them with FDR InfiniBand (green-ib partition)
  - 48 gray nodes (former hpc.ttu.ee nodes, migration in progress): 2x Intel Xeon E5-2630L 6C, 64 GB RAM, 1 TB local drive, 1 Gbit Ethernet, QDR InfiniBand (gray-ib partition)
  - 1 mem1tb large-memory node: 1 TB RAM, 4x Intel Xeon CPU E5-4640 (32 cores, 64 threads in total)
  - amp GPU nodes (see the specific guide for amp): amp1 with 8x Nvidia A100/40GB, 2x 64-core AMD EPYC 7742 (128 cores, 256 threads in total), 1 TB RAM; amp2 with 8x Nvidia A100/80GB, 2x 64-core AMD EPYC 7742 (128 cores, 256 threads in total), 2 TB RAM
- viz.hpc.taltech.ee visualization node (accessible within the university network and FortiVPN): 2x Nvidia Tesla K20Xm graphics cards (on displays :0.0 and :0.1)
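The current layout and state of these nodes can be inspected directly on the cluster with standard SLURM query commands (a sketch; the node name used below is only an example):

```bash
sinfo                          # list partitions, their time limits and node states
sinfo -N -l                    # per-node view with CPU, memory and state information
scontrol show node green1      # detailed information about a single node
scontrol show partition gpu    # detailed information about the gpu partition
```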
SLURM partitions
| partition | default time | time limit | default memory | nodes |
|---|---|---|---|---|
| short | 10 min | 2 hours | 1 GB/thread | green |
| common | 10 min | 8 days | 1 GB/thread | green |
| green-ib | 10 min | 8 days | 1 GB/thread | green |
| long | 10 min | 15 days | 1 GB/thread | green |
| gray-ib | 10 min | 8 days | 1 GB/thread | gray |
| gpu | 10 min | 5 days | 1 GB/thread | amp |
| mem1tb | | | | mem1tb |
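A minimal batch-script sketch using these partitions (the job name, resource values, module name, and executable are illustrative placeholders, not site defaults):

```bash
#!/bin/bash
#SBATCH --job-name=test_job        # illustrative job name
#SBATCH --partition=common         # one of the partitions listed above
#SBATCH --time=01:00:00            # wall-time request, must stay below the partition's time limit
#SBATCH --ntasks=1                 # number of tasks (MPI ranks)
#SBATCH --cpus-per-task=4          # threads per task
#SBATCH --mem-per-cpu=1G           # matches the 1 GB/thread default shown above
#SBATCH --output=slurm-%j.out      # job output file (%j expands to the job ID)

# Load required software through the module environment (module name is an example).
module load gcc

srun ./my_program                  # replace with your actual executable
```

Such a script is submitted with `sbatch jobscript.sh` and can be monitored with `squeue -u $USER`; see Quickstart: Cluster below for site-specific details.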
Contents:
- Quickstart: Cloud
- Quickstart: Cluster
- LUMI
- Courses and introductions
- Module environment (lmod)
- Software packages
- Available MPI versions (and comparison)
- Performance
- Visualization
- GPU-server “amp”
- Containers (Singularity & Docker)
- Acknowledgement