Containers (Singularity & Docker)

Containers are a popular way of creating a reproducible software environment. The most common container solutions are Docker and Singularity; on this cluster we support Singularity.




Running a container


On green or gray nodes

Singularity 3.8.7 is installed natively from the CentOS EPEL repository; there are no modules to load.
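You can confirm the installed version with:

singularity --version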

Pull the Docker image you want, here ubuntu:18.04:

singularity pull docker://ubuntu:18.04
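The pull also stores a local image file (here ubuntu_18.04.sif) in the current directory; later commands can use this file directly instead of contacting Docker Hub again:

singularity exec ubuntu_18.04.sif cat /etc/issue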

Write an sbatch file (here called ubuntu.slurm):

#!/bin/bash
#SBATCH -t 0-00:30
#SBATCH -N 1
#SBATCH --cpus-per-task=2   # Singularity can use multiple cores; -c is a synonym for this option
#SBATCH --mem-per-cpu=4000
singularity exec docker://ubuntu:18.04 cat /etc/issue

Submit it to the queueing system with

sbatch ubuntu.slurm

and when the resources become available, your job will be executed.
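You can follow the job with squeue and, once it has run, find its output in the submission directory (slurm-<jobid>.out by default):

squeue -u $USER
cat slurm-<jobid>.out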

On amp nodes (not using GPU)

You need to load the modules:

module load amp
module load Singularity/3.7.3

Pull the Docker image you want, here ubuntu:18.04:

singularity pull docker://ubuntu:18.04

Write an sbatch file (here called ubuntu.slurm):

#!/bin/bash
#SBATCH -t 0-00:30
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -p gpu   # amp nodes are in the gpu partition, even for non-GPU jobs
#SBATCH --mem-per-cpu=4000
singularity exec docker://ubuntu:18.04 cat /etc/issue

Submit it to the queueing system with

sbatch ubuntu.slurm

and when the resources become available, your job will be executed.

On amp nodes (using GPU)

When running Singularity through SLURM (srun, sbatch), only the GPUs reserved through SLURM are visible to Singularity.
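You can check this from inside a job; SLURM exports the reserved devices in CUDA_VISIBLE_DEVICES, for example:

srun -p gpu --gres=gpu:A100:1 --pty bash
echo $CUDA_VISIBLE_DEVICES   # lists only the GPU(s) reserved for this job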

Load the modules:

module load amp
module load cuda/11.3
module load Singularity/3.7.3

Pull the Docker image you want, here ubuntu:18.04:

singularity pull docker://ubuntu:18.04

Write an sbatch file (here called ubuntu.slurm):

#!/bin/bash
#SBATCH -t 0-00:30
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -p gpu
#SBATCH --gres=gpu:A100:1     # only request a GPU if your job actually uses it
#SBATCH --mem-per-cpu=4000
module load amp
module load cuda/11.3
module load Singularity/3.7.3
# the --nv option makes the reserved GPU visible inside the container
singularity exec --nv docker://ubuntu:18.04 cat /etc/issue

Submit it to the queueing system with

sbatch ubuntu.slurm

and when the resources become available, your job will be executed.
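To verify that the GPU is actually visible inside the container, you can replace the cat command in the job with nvidia-smi (the host's nvidia-smi is bound into the container by --nv):

singularity exec --nv docker://ubuntu:18.04 nvidia-smi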

For more on Singularity and GPUs, see https://sylabs.io/guides/3.5/user-guide/gpu.html.

Hints

There is no network isolation in Singularity, so there is no need to map any ports (-p in Docker). If a process inside the container binds to an IP:port, it is immediately reachable on the host.
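For example, a small web server started inside a container (image and port chosen purely for illustration) is directly reachable from the host:

singularity exec docker://python:3 python3 -m http.server 8000 &
curl http://localhost:8000/   # answered by the server running inside the container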

Singularity will use all cores reserved with --cpus-per-task. If fewer should be used, the Singularity parameter --cpus can be given; similarly, the memory available to a container can be restricted with the Singularity parameter --memory. These parameters are useful when a single batch job starts several containers concurrently, as sketched below.
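A minimal sketch of such a job, assuming the installed Singularity version supports the --cpus and --memory flags and using placeholder commands:

#!/bin/bash
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=2000
# run two containers concurrently, each restricted to half of the reservation
singularity exec --cpus 2 --memory 2G docker://ubuntu:18.04 <command-1> &
singularity exec --cpus 2 --memory 2G docker://ubuntu:18.04 <command-2> &
wait   # keep the batch job alive until both containers have finished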

For converting Docker images to Singularity, see

https://www.nas.nasa.gov/hecc/support/kb/converting-docker-images-to-singularity-for-use-on-pleiades_643.html
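If your image exists only locally in Docker (not in a registry), one common route, sketched here with a hypothetical image name, is to export it with Docker and build a Singularity image from the archive:

# on a machine where Docker is available
docker save myimage:latest -o myimage.tar
# on the cluster
singularity build myimage.sif docker-archive://myimage.tar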




Example: Interactive TensorFlow job


Start an interactive session on amp, make the modules available, and run the Docker image in Singularity:

srun -t 1:00:00 -p gpu --gres=gpu:A100:1 --pty bash
source /usr/share/lmod/lmod/init/bash
module load amp
module load cuda/11.3
module load Singularity/3.7.3
singularity run --nv docker://tensorflow/tensorflow:latest-gpu

Inside the container, run

python
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

The following is the “TensorFlow 2 quickstart for beginners” from https://www.tensorflow.org/tutorials/quickstart/beginner; continue inside the Python session:

import tensorflow as tf
print("TensorFlow version:", tf.__version__)

# load the MNIST dataset and scale the pixel values to [0, 1]
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# build a small feed-forward classifier that outputs logits for 10 classes
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

# run the untrained model on one example: raw logits and their softmax
predictions = model(x_train[:1]).numpy()
predictions
tf.nn.softmax(predictions).numpy()

# train for 5 epochs and evaluate on the test set
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(y_train[:1], predictions).numpy()
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test,  y_test, verbose=2)

# wrap the trained model with a softmax layer to get class probabilities
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])
probability_model(x_test[:5])
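For a quicker check that TensorFlow sees the GPU, a one-liner from the container shell also works:

python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"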



Example job for OpenDroneMap (ODM)


OpenDroneMap needs a writable directory for the data. This directory needs to contain a subdirectory named images.

Assume you keep your ODM projects in the directory opendronemap:

opendronemap
|
|-Laagna-2021
| |
| |-images
|
|-Paldiski-2015
| |
| |-images
|
|-Paldiski-2018
| |
| |-images
|
|-TalTech-2015
| |
| |-images

If you want to create a 3D model for Laagna-2021, you would run the following Singularity command:

singularity run --bind $(pwd)/opendronemap/Laagna-2021:/datasets/code docker://opendronemap/odm --project-path /datasets

For creating a DEM, you would need to add --dsm to the ODM options and potentially bind a writable output directory with --bind "$(pwd)/odm_dem:/code/odm_dem" (-v is the equivalent Docker option); see the combined command below.
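Putting it together, a sketch of the full command for Laagna-2021 with a DSM (the output directory is created first):

mkdir -p odm_dem
singularity run --nv --bind $(pwd)/opendronemap/Laagna-2021:/datasets/code --bind $(pwd)/odm_dem:/code/odm_dem docker://opendronemap/odm --project-path /datasets --dsm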

GPU use in Singularity is enabled with the --nv switch. Be aware that ODM uses the GPU only for feature matching, which accounts for only a small percentage of the total computation time.

The SLURM job-script looks like this:

#!/bin/bash
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task=10
#SBATCH --time 01:30:00
#SBATCH --partition gpu
#SBATCH --gres=gpu:A100:1

module load amp
module load Singularity/3.7.3

singularity run --nv --bind $(pwd)/opendronemap/Laagna-2021:/datasets/code docker://opendronemap/odm --project-path /datasets --dsm
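Save the script, e.g. as odm.slurm, and submit it as before:

sbatch odm.slurm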