UTEP Paro Quick User Guide

Paro is UTEP's current high performance computing (HPC) environment and is available for all computational processing needs. The system is expanding as demand for HPC resources grows. The environment is built on heterogeneous system architectures and the latest software and storage technology, and it meets almost any requirement for science, engineering, health science, artificial intelligence, and more.

Quick User Guide

Important

Make sure your computer is connected to the UTEP network, either on campus or through VPN from off campus.

To connect to Paro, we’ll use the ssh command:

# replace username with your Paro username
ssh username@paro.utep.edu
  • If you are using Windows, you can use clients such as MobaXterm or PuTTY, or connect from a PowerShell terminal.
  • If you are using a Mac or Linux, you will already have the ssh command available in your terminal.

After logging in, your home directory is the current working directory.
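
If you connect frequently, you can optionally add an entry to the ~/.ssh/config file on your local computer so that a short alias is enough to connect. A minimal sketch, where the alias name paro is arbitrary and username is a placeholder for your Paro username:

# in ~/.ssh/config on your local computer (not on Paro)
Host paro
    HostName paro.utep.edu
    User username

With this entry in place, ssh paro is equivalent to the full command above.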

Compiling

To compile executables in your home directory, first load the required environment modules for compilers and libraries.

  • List all available modules:
module spider
  • Load and use OpenMPI4:
source /opt/intel/oneapi/setvars.sh -ofi_internal=1 --force
module load gnu12 openmpi4
export I_MPI_OFI_PROVIDER=tcp

In this example, the environment is set up for the GNU version 12 compilers, and the OpenMPI4 wrappers and libraries are loaded. You can then compile your executable.
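
With these modules loaded, an MPI program written in C can, for example, be compiled with the OpenMPI compiler wrapper. A minimal sketch, where hello_mpi.c is a placeholder for your own source file:

# compile a C MPI source file into an executable named hello_mpi
mpicc -O2 -o hello_mpi hello_mpi.c

The corresponding wrappers mpicxx and mpif90 are available for C++ and Fortran code.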

Submit a job to the cluster

To run a job on a high-performance computing (HPC) cluster, you need to prepare a submit script. This script tells the cluster what resources your job needs and what commands to run. Here's an example of a submit script:

#!/bin/bash
#SBATCH -n 4
#SBATCH -p general
#SBATCH -o output.txt
#SBATCH -e error.txt
source /opt/intel/oneapi/setvars.sh -ofi_internal=1 --force
module load gnu12 openmpi4
export I_MPI_OFI_PROVIDER=tcp
mpirun -np 4 ./a.out
Explanation:
  • #!/bin/bash: Specifies the shell.
  • #SBATCH -n 4: Requests 4 CPU cores for the job.
  • #SBATCH -p general: Specifies the partition (queue) to use.
  • #SBATCH -o output.txt: Saves the output (results) of your job to a file named output.txt.
  • #SBATCH -e error.txt: Saves any error messages to a file named error.txt.
  • The next three lines load everything needed at runtime, in this case the oneAPI environment, gnu12, and openmpi4.
  • The last line is the actual command to execute. In this example it starts an MPI executable with 4 processes in parallel.
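
Beyond these minimum settings, a few other sbatch directives are commonly useful, such as a job name, a wall-time limit, and a node count. A sketch with placeholder values (the partition's actual limits may differ):

# job name shown in squeue
#SBATCH -J myjob
# wall-time limit (hh:mm:ss)
#SBATCH -t 01:00:00
# run all tasks on a single node
#SBATCH -N 1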

To submit the job, use this command (the submit script can have any name):  

sbatch submitscript.sh

[Figure: Slurm submit]

Check job status

To check the status of the jobs in the cluster, you can use the squeue command. This command provides a comprehensive overview of all running or pending jobs.

squeue

[Figure: Slurm squeue]

You can also use various options with squeue to filter and format the output according to your needs. For example:

  • Filter by user:
squeue -u username
  • Filter by job state:
squeue -t RUNNING
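
You can also control which columns squeue prints with the -o/--format option. A sketch using standard squeue format specifiers (job ID, partition, name, state, elapsed time, node count, and reason/node list):

# customize the squeue output columns for your own jobs
squeue -u username -o "%.10i %.9P %.20j %.8T %.10M %.6D %R"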

Cancel a job

A running or pending job can be cancelled with the command scancel <jobid>. Use the squeue command to find the job ID, then cancel the job with:

scancel <jobid>
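
scancel can also act on several jobs at once, for example everything owned by your user. A sketch, where username is a placeholder for your Paro username:

# cancel all of your own jobs
scancel -u username
# cancel only your pending jobs
scancel -u username -t PENDING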

Create SSH keys if required by MPI

MPI uses SSH as the communication protocol between processes. In most cases, SSH connections between processes must be established without passwords. To allow password-less SSH connections, the user has to create encrypted keys.

Create the keys and configure your environment with the following commands from your home directory.

ssh-keygen -t rsa

Caution

Follow the prompts of the command and accept the recommended defaults. Do not enter a passphrase.

After the key has been created, follow these steps:

cd $HOME/.ssh
cp id_rsa.pub authorized_keys
chmod 700 $HOME/.ssh
chmod 600 $HOME/.ssh/authorized_keys

Note

Make sure the permission mode of the .ssh directory is set to 700, which gives read, write, and execute access to the owner only.

The files in the .ssh directory must have these permissions:

-rw------- authorized_keys
-rw------- id_rsa
-rw-r--r-- id_rsa.pub
-rw-r--r-- known_hosts

To avoid prompts or extra output when SSH connections are made, create a file named "config" in the .ssh directory and add these lines to it:

Host *
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
    LogLevel QUIET
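
To check that password-less SSH works, you can open a test connection back to the login node. A minimal check, run from your session on Paro; if the setup is correct it prints a hostname without asking for a password:

# should print a hostname without prompting for a password
ssh localhost hostname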

Interactive shell sessions

Slurm allows users to start an interactive shell on the system after requesting resources, similar to submitting a batch job. This interactive session can also be used to run parallel MPI processes with the number of requested cores. The benefit of an interactive session is that users can execute commands manually and interactively, just like in a Linux terminal.

To start an interactive session, use a command like this:

srun -n 4 -p general --pty bash -l

In this example, the command requests an interactive bash shell with four cores on the general partition.

Explanation:

  • srun: The Slurm command to run a job.
  • -n 4: Requests four tasks (cores).
  • -p general: Specifies the partition (queue) named "general".
  • --pty: Allocates a pseudo-terminal for the interactive session.
  • bash -l: Starts a login shell.

Like a submitted batch job, an interactive shell provides the necessary resources for your commands to execute. This is particularly useful for testing, debugging, or running commands that require user interaction.
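
Inside the interactive shell you can then load the runtime environment and launch MPI processes by hand, mirroring the batch script shown earlier. A sketch, where a.out stands for your own executable:

source /opt/intel/oneapi/setvars.sh -ofi_internal=1 --force
module load gnu12 openmpi4
export I_MPI_OFI_PROVIDER=tcp
# start 4 MPI processes on the allocated cores
mpirun -np 4 ./a.out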

[Figure: Slurm srun]

Regulatory Guidelines for Cluster Usage

To ensure fair and efficient use of the Slurm cluster resources, please adhere to the following guidelines:

  • Stay within Assigned Partitions: Users must submit jobs only to the partitions they are authorized to use. Unauthorized use of partitions can lead to job termination and potential access restrictions.

  • General Partition: Users are limited to a maximum of 40 cores (one node) on the general partition; a compliant resource request is sketched after this list. This ensures equitable access to resources for all users.

  • Resource Requests: Always request only the necessary resources for your job. Over-requesting resources can lead to inefficient use of the cluster and longer wait times for all users.

  • Job Monitoring: Regularly monitor your jobs using the squeue command to ensure they are running as expected and within the allocated resources.
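
For example, a request that stays within the 40-core, single-node limit of the general partition could look like this (a sketch; request only as many cores as your job actually needs):

# general partition, one node, at most 40 cores
#SBATCH -p general
#SBATCH -N 1
#SBATCH -n 40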

Warning

Adherence to these guidelines is mandatory. Non-compliance may result in job termination and further administrative action.

Support

If you have any questions or need support, please open a ticket in the Service Desk:

Tip

Need extra resources? Reach out to the team at radcadmin@utep.edu and ask how to become an investor.