Monitoring Jobs
Slurm Job Monitoring Cheat Sheet
Command | Description | Example |
---|---|---|
`squeue` | View jobs in the queue (running and pending) | `squeue -u $USER` (show only your jobs) |
`scontrol show job <jobid>` | Detailed info about a job (state, reason, resources) | `scontrol show job 12345` |
`sstat -j <jobid>` | Runtime stats of a running job (CPU, memory, I/O) | `sstat -j 12345` |
`sacct` | Show job history and accounting data (running, finished, failed) | `sacct -u $USER --format=JobID,State,Elapsed,MaxRSS` |
`sinfo` | View partitions and node availability | `sinfo -s` (summary view) |
`scontrol show node <nodename>` | Detailed info about a specific compute node | `scontrol show node cnode01` |
`seff <jobid>` | Job efficiency (CPU/memory usage vs. request, if installed) | `seff 12345` |
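
The `MaxRSS` column from `sacct` is usually printed with a unit suffix (for example `2048K`), which makes values awkward to compare. The following is a minimal sketch of a helper that normalizes such values to MiB. The function name `to_mib` is hypothetical, and the sketch assumes that suffixes are powers of 1024 and that a bare number means bytes; check this against the output on your cluster.

```shell
# Hypothetical helper: convert a memory value such as "2048K", "512M"
# or "1.5G" to MiB, printed with one decimal place.
# Assumption: K/M/G/T are powers of 1024; a bare number is bytes.
to_mib() {
    awk -v v="$1" 'BEGIN {
        unit = substr(v, length(v), 1)   # last character is the unit suffix
        num  = v + 0                     # awk keeps the leading numeric part
        if      (unit == "K") mib = num / 1024
        else if (unit == "M") mib = num
        else if (unit == "G") mib = num * 1024
        else if (unit == "T") mib = num * 1024 * 1024
        else                  mib = num / (1024 * 1024)
        printf "%.1f\n", mib
    }'
}
```

For example, `to_mib 2048K` would print the value of a 2048 KiB peak in MiB, which can then be compared directly with the memory requested for the job.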
Typical workflow for checking your jobs:

    squeue -u $USER              # See queued & running jobs
    sacct -u $USER               # See finished jobs
    scontrol show job <jobid>    # Detailed info
    sstat -j <jobid>             # Runtime resource usage
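
When the job history is long, a per-state tally can be easier to scan than the raw `sacct` listing. The sketch below is one possible way to do that; the function name `count_states` is hypothetical, and it assumes pipe-delimited input with the job ID in the first column and the state in the second, as produced by `sacct --parsable2 --noheader --format=JobID,State`.

```shell
# Hypothetical helper: count jobs per state from pipe-delimited
# sacct output read on stdin (columns: JobID|State|...).
count_states() {
    awk -F'|' '
        NF >= 2 {
            split($2, parts, " ")    # "CANCELLED by 1000" -> "CANCELLED"
            count[parts[1]]++
        }
        END { for (s in count) print s, count[s] }
    ' | sort
}
```

A possible invocation is `sacct -u "$USER" -X --noheader --parsable2 --format=JobID,State | count_states`. The `-X` flag restricts the output to job allocations, so individual job steps are not counted as separate jobs.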