Tips, Help & Wrap-Up

Slurm cheatsheet, troubleshooting,
and where to go from here

Resources

Section 7 — 15 min

Useful sbatch directives

Directives to explore on your own — man sbatch is the full reference

Directive           Purpose
--mem               Memory per node
--mem-per-cpu       Memory per allocated CPU core
--cpus-per-task     CPU cores per task (threads)
--ntasks-per-node   Number of tasks on each node
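
As a rough illustration of how these directives combine, here is a minimal job script sketch. The job name, the resource sizes and the time limit are placeholder assumptions, not recommended values for Isambard 3. The arithmetic at the end only shows how the per-node totals follow from the request.

```shell
#!/bin/bash
# Hypothetical job name and sizes, chosen only for illustration.
#SBATCH --job-name=directive_demo
#SBATCH --nodes=1
# 4 tasks on the node, each with 2 cores (threads).
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=2
# 2 GB per allocated core; --mem and --mem-per-cpu are mutually exclusive.
#SBATCH --mem-per-cpu=2G
#SBATCH --time=00:05:00

# Slurm exports the request as environment variables inside the job.
# The defaults after ":-" only apply when the script runs outside Slurm.
tasks=${SLURM_NTASKS_PER_NODE:-4}
cpus=${SLURM_CPUS_PER_TASK:-2}
mem_per_cpu_gb=2

# Let threaded (OpenMP) programs use exactly the cores each task was given.
export OMP_NUM_THREADS=${cpus}

# Per-node totals implied by the request above.
echo "cores per node:  $((tasks * cpus))"
echo "memory per node: $((tasks * cpus * mem_per_cpu_gb))G"
```

With the values shown, the request works out to 8 cores and 16G on the node, which is what --mem=16G would have asked for directly.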

Other useful man pages:

man srun · man squeue · man scancel · man sacct · man scontrol · man sacctmgr · man sinfo

All are also available online at https://slurm.schedmd.com.

Separate stdout and stderr

Useful for clean parseable output — but keep them merged while debugging

#SBATCH --output=hello_world.out
#SBATCH --error=hello_world.err

By default Slurm writes stdout and stderr to the same file: slurm-<jobid>.out, or the file named by --output if you set only that. Splitting them gives you clean output you can parse or post-process.

Trade-off: you lose the interleaving that shows how stdout and stderr relate in time. Keep them merged while debugging, split them when you need machine-readable output.
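
The difference can be tried outside Slurm with plain shell redirection. The sketch below is only an analogy: the payload function and the file names are hypothetical stand-ins for a real job and its --output and --error files.

```shell
#!/bin/bash
# Stand-in for a job payload (hypothetical): one result line on stdout,
# one diagnostic line on stderr.
payload() {
    echo "result=42"
    echo "warning: input file was empty" >&2
}

workdir=$(mktemp -d)

# Split: comparable to setting both --output and --error.
payload > "${workdir}/split.out" 2> "${workdir}/split.err"

# Merged: comparable to setting only --output.
payload > "${workdir}/merged.out" 2>&1

echo "split.out:";  cat "${workdir}/split.out"
echo "split.err:";  cat "${workdir}/split.err"
echo "merged.out:"; cat "${workdir}/merged.out"
```

The split.out file holds only the result line and can be parsed directly, while merged.out keeps both lines in the order they were written.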

Check CPU usage of completed jobs

sacct tells you how efficiently your job used its allocation

# Inside a job script ${SLURM_JOB_ID} is set for you;
# from a login shell, put the numeric job ID here instead.
sacct -j ${SLURM_JOB_ID} \
  --format=JobID,JobName%15,Elapsed,TotalCPU,NCPUS,CPUTime,AveCPU,MaxRSS

Column     Meaning
TotalCPU   User + sys CPU time summed over all tasks
Elapsed    Wall-clock time
NCPUS      Cores allocated
CPUTime    Elapsed × NCPUS, the budget you could have used
AveCPU     Average CPU time per task
MaxRSS     Peak memory usage

Utilisation = TotalCPU / CPUTime. A healthy run is near 1.0.

If TotalCPU > NCPUS × Elapsed, threads are fighting for cores — you have allocated fewer cores than the job actually uses.
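
The utilisation ratio can be worked out by hand or with a small helper. The sketch below assumes sacct prints times as [DD-]HH:MM:SS or MM:SS.mmm. The function names to_seconds and utilisation are hypothetical, and the times in the example are made up.

```shell
#!/bin/bash
# Convert a sacct-style time string to whole seconds.
# Assumed formats: DD-HH:MM:SS, HH:MM:SS, MM:SS.mmm
to_seconds() {
    awk -v t="$1" 'BEGIN {
        days = 0
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        n = split(t, p, ":")
        if (n == 3)      s = p[1] * 3600 + p[2] * 60 + p[3]
        else if (n == 2) s = p[1] * 60 + p[2]
        else             s = p[1]
        printf "%d\n", days * 86400 + s   # fractional seconds are dropped
    }'
}

# Utilisation = TotalCPU / CPUTime, printed to two decimal places.
utilisation() {
    awk -v total="$(to_seconds "$1")" -v budget="$(to_seconds "$2")" 'BEGIN {
        if (budget > 0) printf "%.2f\n", total / budget
        else            print "n/a"
    }'
}

# Made-up example: 4 cores for 10 minutes gives CPUTime 00:40:00,
# of which the job actually used 32 minutes of CPU.
utilisation "00:32:00" "00:40:00"   # prints 0.80
```

A value of 0.80 here means a fifth of the allocated core time went unused. For a real job, the two arguments would be the TotalCPU and CPUTime columns from the sacct command above.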

Slurm cheatsheet

Commands worth bookmarking — reference, not memorisation

# Partitions and limits
sinfo
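# Columns: partition, availability, node count, CPUs per node,
# memory per node (MB), generic resources, time limit, node names
sinfo -o "%P %a %D %c %m %G %l %N"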
scontrol show partition

# Node details
scontrol show node x3008c0s15b2n0

# Overall config
scontrol show config

# QOS and accounting
sacctmgr show qos
sacctmgr show qos grace_qos
sacctmgr show user ${USER} withassoc
sacctmgr show assoc user=${USER}

# Your jobs
squeue --me
sacct

File transfer

rsync is the universal workhorse

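# Push a directory to Isambard
# (your local shell expands ${PROJECTDIR} and ${SCRATCHDIR} before rsync runs,
#  so set them locally to the remote paths, or write the paths out in full)
rsync -avz my_project/ e6c.3.isambard:${PROJECTDIR}/my_project/

# Pull results back
rsync -avz e6c.3.isambard:${SCRATCHDIR}/results/ ./results/

  • -a archive mode (preserves permissions, timestamps, symlinks)
  • -v verbose
  • -z compress during transfer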

rsync only transfers changed files on subsequent runs — safe to re-run after interrupted transfers.
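
Before a large transfer it can help to preview what rsync would do. The sketch below uses two temporary local directories as stand-ins for the local and Isambard sides, so it runs without cluster access. The directory and file names are hypothetical.

```shell
#!/bin/bash
# Temporary local directories stand in for the two ends of the transfer.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "hello" > "${src}/a.txt"

dry_run_copied="unknown"
if command -v rsync > /dev/null 2>&1; then
    # --dry-run (-n) lists what would be transferred without copying anything.
    rsync -avz --dry-run "${src}/" "${dst}/"
    if [ -e "${dst}/a.txt" ]; then dry_run_copied="yes"; else dry_run_copied="no"; fi

    # The real transfer. The trailing slash on the source means
    # "the contents of this directory", not the directory itself.
    rsync -avz "${src}/" "${dst}/"
    ls "${dst}"
else
    echo "rsync is not installed on this machine"
fi
```

The same --dry-run flag can be added to the push and pull commands above to check source and destination paths before any data moves.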

Troubleshooting: terminal compatibility

Garbled output on SSH? Your terminal might be setting an unknown $TERM

Some terminal emulators (e.g. Ghostty, kitty) set a $TERM value the remote system does not recognise.

Fix: override $TERM for the SSH session:

TERM=xterm-256color command ssh e6c.3.isambard

Or add to your SSH config:

Host e6c.3.isambard
    SetEnv TERM=xterm-256color
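
To check whether a given $TERM value is the cause, one option is to ask the remote system whether it has a terminfo entry for it. The sketch below assumes infocmp (part of ncurses) is available there. The function names term_is_known and safe_term are hypothetical.

```shell
#!/bin/bash
# True if this machine has a terminfo entry for the given terminal name.
# If infocmp itself is missing, the name is treated as unknown.
term_is_known() {
    infocmp "$1" > /dev/null 2>&1
}

# Echo the given name if it is known here, otherwise a widely supported fallback.
safe_term() {
    if term_is_known "$1"; then
        echo "$1"
    else
        echo "xterm-256color"
    fi
}

# Example: the name kitty sets by default.
safe_term "xterm-kitty"
```

Run on the remote side, this prints the name unchanged when the system recognises it and xterm-256color otherwise, which tells you whether the override above is needed.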

Where to look next

Isambard 3 docs

https://docs.isambard.ac.uk

System guides, storage, Slurm, software, containers.

Slurm docs

https://slurm.schedmd.com/documentation.html

Full reference for every command and directive.

BriCS helpdesk

https://support.isambard.ac.uk/

System issues, accounts, allocations.

Getting help

UoE RSE support (workshop follow-up, usage questions):

isambard-support@exeter.ac.uk

BriCS helpdesk (system issues, accounts, allocations):

https://support.isambard.ac.uk/

Isambard 3 documentation: https://docs.isambard.ac.uk

Questions?

Q & A

Questions, comments, or things you want to try next?

Feedback

We will send a short feedback survey by email after the workshop.

Please fill it in — your responses directly shape future sessions.

Thank you for attending!