Working with the cluster

Working with the cluster can be divided into four stages.

  1. Logging into the cluster.
  2. Queuing a job (script).
  3. Executing and monitoring a job.
  4. Receiving the results.

A job or task is a simulation or program code that is executed by a user on the cluster.

User workspace

Each user has their own workspace where files related to their jobs can be stored. When logging in to the login node command line, the user is automatically placed in their working directory:

/home/username

The file system is shared between the cluster nodes and the login node, so there is no need to copy files to or from the execution nodes or between nodes. A file placed on the login node can be accessed from all other nodes, and any changes made to files on the login node are also visible on the other nodes.

A user can also install or compile applications in their working directory and use them on the cluster, provided the installation does not require administrator (root) rights.
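
For example, a typical autotools-based package could be built into the home directory as follows (a minimal sketch; the package name and paths are illustrative):

tar xzf mytool-1.0.tar.gz
cd mytool-1.0
./configure --prefix=$HOME/soft/mytool
make && make install
export PATH=$HOME/soft/mytool/bin:$PATH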

Preparing the environment (modules)

To prepare the environment for various applications, compilers and libraries, modules (Environment Modules) are used. A module is loaded with the command:

module load module_name

The command sets the path to the respective application directory and any other necessary environment variables. To see the list of all available modules (applications and tools), execute:

module avail

The tools included in the Linux distribution normally do not require loading a module.

A module must be loaded on the node where the program is launched. If a job is performed on computing node(s), the module must also be loaded on each of them.

To unload a module and restore the environment:

module unload module_name
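
To see which modules are currently loaded, a standard Environment Modules command can be used:

module list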

More information about Environment Modules: http://modules.sourceforge.net/

Where does a job run?

Usually there are three components involved when working on a cluster:

  • user’s computer;
  • cluster access node (ui-1.hpc.rtu.lv);
  • computing node (e.g. wn02).

Logging in to the cluster login node and executing a command in the terminal (or opening an application GUI) does not mean that the job will automatically run on the computing nodes. Most probably, it will run on the login node and overload it.

Queuing a job

Before a job gets to a computing node and its execution starts, it is placed in a virtual queue. The queue organises resource allocation in a multi-user system where the number of jobs and their requirements may exceed the amount of available resources (CPU, memory). When resources become available again, as a rule the job that has been waiting in the queue the longest is executed next. Users do not have to monitor resource availability, as job movement through the queue and execution are automatic. If there is no queue (i.e. the resources are available), the job is started immediately. There are several queues on the RTU cluster, which differ by job duration and the amount of available resources:

  • batch;
  • fast;
  • long.

Detailed description of queues is available in the section Job queues.
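
The list of queues and their limits can usually be displayed with a standard Torque command:

qstat -q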

Simple job

Jobs are queued using the Torque/Moab cluster user tools.

Command for queuing a simple job (script):

qsub test.sh

test.sh is a bash (Linux command line) script in which the user enters, one after another, the commands to be executed when the job reaches a computing node. This allows the batch to run without the user's participation. A script may, for example, contain the following command:

#!/bin/bash
echo "Hello world from node `/bin/hostname`"

The command prints the name of the computing node on which it runs. It can also be executed locally.
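
For example, to try the script locally on the login node before queuing it:

bash test.sh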

Example of a script to launch your program with parameters on the computing node:

#!/bin/bash

./myprogram 12 3

Other useful information is available here:

/opt/exp_soft/user_info
Interactive job

Alternatively, an interactive job can be used instead of a batch job. The interactive mode is convenient for testing and debugging jobs, or when graphical tools are used. To start an interactive job:

qsub -I

A remote terminal on a computing node will be opened automatically, where the user can execute the needed commands from the command line. The command is similar to "ssh wn[xx]", with the difference that resources are reserved and there will be no conflicts with other users.
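
For example, inside the interactive session one might run (a minimal sketch; the module and program names are illustrative):

module load module_name
cd $PBS_O_WORKDIR
./myprogram 12 3
exit

The exit command ends the interactive job and releases the reserved resources.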

If it is necessary to open a graphical window in interactive mode, add the -X parameter:

qsub -X -I

More information is available here: Graphic tools.

Job parameters and requirements

Users can specify job parameters and requirements, such as the name of the queue to be used or the time needed for the job. This information is used to find the most suitable resources for the job.

qsub -N my_job -q fast -l walltime=00:00:30 test.sh

Job requirements can also be added at the beginning of the launch script:
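
For example (a minimal sketch; the values mirror the qsub example above, and the program name is illustrative):

#!/bin/bash
#PBS -N my_job
#PBS -q fast
#PBS -l walltime=00:00:30

./myprogram 12 3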

Commentary. Parameters can be specified both on the command line and in the script; if a parameter is repeated, the command line value takes priority.

How to request specific computing resources? Define the requirements with qsub -l, indicating these parameters:

  • number of nodes and cores
-l nodes=1:ppn=12
  • particular node (may result in more queue time)
-l nodes=wn44:ppn=12
  • number of GPUs
-l nodes=1:ppn=12:gpus=2
  • memory amount per core (processor)
-l nodes=1:ppn=12,pmem=1g
  • total memory amount for a job
-l nodes=1:ppn=12,mem=12g

  • computing nodes with particular features (features are usually used on clusters with non-homogeneous nodes)
-l nodes=1:ppn=12,mem=12g,feature=centos7

Hostnames, parameters and features are provided in the section List of Nodes.

Find out more about the qsub command and parameters by executing:

man qsub
Parallel (MPI) job

A job is divided between several cores in a node or between cluster nodes using the Message Passing Interface (MPI) protocol.

Queuing a parallel job requiring 24 cores (2 nodes × 12 cores in each node):

qsub -l nodes=2:ppn=12 run_mpi.sh

or, without indicating a specific number of nodes:

qsub -l procs=24 run_mpi.sh

An example of a run_mpi.sh script:
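
A minimal sketch, assuming an OpenMPI module is available (the module and program names are illustrative):

#!/bin/bash
module load openmpi
cd $PBS_O_WORKDIR
mpirun -np $PBS_NP -machinefile $PBS_NODEFILE ./my_mpi_program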

MPI examples are available here: /opt/exp_soft/users_info/mpi

Job variables

In a job script it is convenient to use variables that are set automatically when the job reaches a computing node.

$PBS_O_WORKDIR
$PBS_NODEFILE
$PBS_GPUFILE
$PBS_NP
$PBS_JOBID
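
In brief (standard Torque/PBS meanings): $PBS_O_WORKDIR is the directory from which the job was submitted, $PBS_NODEFILE and $PBS_GPUFILE are files listing the computing nodes and GPUs allocated to the job, $PBS_NP is the number of allocated cores, and $PBS_JOBID is the job identifier.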

For example, to change to the directory from which the job was submitted to the queue:

cd $PBS_O_WORKDIR
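
Or, to list the nodes allocated to the job (the node file contains one entry per allocated core):

sort -u $PBS_NODEFILE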

To get a list of all these variables, start an interactive job (qsub -I) and execute the command:

env | grep PBS
Cancelling a job

A running job or a job waiting in the queue can be cancelled with the command:

qdel job_id

or

canceljob job_id

job_id – the job identifier.

To cancel all of a user's jobs:

qdel 'all'
Queue, node and job monitoring

Command to check the progress of queued jobs:

qstat

R – running, C – completed, Q – queued

or to see the overall job queue (for all users):

showq

Command to find out the available computing resources:

showbf

or, for more detailed information:

pbsnodes

To get detailed information about a job's execution, as well as the reasons why a job is being held in the queue:

checkjob job_id -vvv
Job execution efficiency

CPU usage efficiency of running jobs:

showq -r

see the EFFIC column

To check the system load and efficiency caused by a job on a computing node:

1. Find out on which node the job is running:

qstat -n job_id

2. Connect remotely to the respective node, for example wn01:

ssh wn01

3. Use Linux monitoring tools:

htop
nvidia-smi
iostat 
nfsstat
  • check CPU usage efficiency with the htop command;
  • monitor GPU usage efficiency (nvidia-smi).
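
For example, to refresh the GPU statistics every second while the job is running (using the standard Linux watch utility):

watch -n 1 nvidia-smi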