- How do I login to Tusker?
- How do I maximize file performance for /work on Tusker?
- What is the Maui Scheduler?
- What is module and how do I use it?
- How do I create a script for a Serial job?
- Why can't my job write anything to my home directory?
- Where should I store my data?
- How can I run a high memory job?
- How do I create a script for a parallel job?
- How do I submit a job to the scheduler?
- Now that I've submitted my job, how do I check the status?
- How do I kill a job?
- How do I compile an MPI program?
- My job requires me to use an entire node or nodes exclusively. How do I request this?
- What are the complexity requirements when changing my password?
How do I login to Tusker?
Using any SSH client, connect to tusker.unl.edu (for example, ssh username@tusker.unl.edu).
How do I maximize file performance for /work on Tusker?
- Parallel file systems suffer poor performance with numerous small files
- Faster performance for large files (>100MB) stored in /work/group/username/parallel
Tusker has a parallel distributed file system called Lustre backing the /work directories. Lustre distributes files across various back-end storage arrays in the file system. By default, each newly created file is stored in its entirety on a single back-end storage array. Lustre can also be configured on a per file or directory basis to spread file contents across the back-end storage arrays. This can increase the file access performance with certain types of workloads and file sizes.
By default, the /work/group/username directory is configured so that each new file created under it is stored in its entirety on a single back-end storage array. This works best for small files (less than 100MB) that have low to moderate concurrent access from the processes in your job.
The /work/group/username/parallel directory is configured so new files are striped across the various back-end storage arrays. Use this directory for large files only. Small files will suffer a significant performance degradation under this path. Large files will have greater read/write parallelism across the back-end storage arrays.
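The size guideline above can be turned into a small helper. This is only a sketch: the function name choose_work_dir is hypothetical, and treating 100MB as 100 x 1024 x 1024 bytes is an assumption about how the guideline is meant.

```shell
# Sketch: pick a destination under /work for a file, following the
# 100MB guideline. choose_work_dir is a hypothetical helper name.
THRESHOLD_BYTES=$((100 * 1024 * 1024))   # assumption: 100MB = 100 * 1024 * 1024 bytes

choose_work_dir() {
    file=$1        # file to be placed
    base=$2        # e.g. /work/group/username
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -gt "$THRESHOLD_BYTES" ]; then
        echo "$base/parallel"    # large file: striped directory
    else
        echo "$base"             # small file: default single-array layout
    fi
}
```

For example, choose_work_dir results.dat /work/group/username prints the directory suggested for results.dat.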
What is the Maui Scheduler?
Maui is a cluster scheduler similar to Condor, PBS, and SGE. Maui uses Torque (a variant of PBS) to execute jobs on remote nodes. All jobs to be run on the cluster must be submitted to the Maui scheduler for execution.
What is module and how do I use it?
Module is available for use on HCC machines. The module software simplifies the use of different compilers and versions by setting the environment for each with the use of a single command.
To see the list of available modules, run the command module avail.
To use a particular module, run module load modulename. For example, to use the 9.0-3 version of the PGI compiler suite, run module load pgi/9.0-3. To unload a module, run module unload modulename. To see the currently loaded module(s), run module list.
Switching modules may be done by either first unloading the old and then loading the new module, or running module switch oldmodule newmodule.
To see a complete list of module commands/options, run module help.
Please note that if you compile your application using a particular module, you must include the appropriate module load statement in your submit script.
How do I create a script for a Serial job?
The most common way to submit a job is with a submit script.
A sample submit script would look like:
#!/bin/sh
#PBS -N TestJob
#PBS -l select=1
#PBS -l walltime=00:01:00
#PBS -o TestJob.stdout
#PBS -e TestJob.stderr
cd $PBS_O_WORKDIR
sleep 10
NOTICE: PBS/Torque directives start with #PBS.
NOTICE: If you do not request a realistic walltime (accurate to within 3 days), your queue priority can be severely penalized.
#!/bin/sh - This tells the computer that this is an executable script and that it should be executed with the /bin/sh interpreter. This could also be /bin/bash (or another shell), or even /usr/bin/python.
#PBS -N TestJob - This tells PBS that the name of this job is 'TestJob'
#PBS -l select=1 - All requests for resources start with the -l option. This tells Maui that we want 1 processor for the execution of this job.
#PBS -l walltime=00:01:00 - Again, this is a request for resources. This line says that we want 1 Minute of run time.
#PBS -o TestJob.stdout - This says that we want to redirect standard output to the file TestJob.stdout.
#PBS -e TestJob.stderr - Redirect standard error to TestJob.stderr.
cd $PBS_O_WORKDIR - The $PBS_O_WORKDIR environment variable is set to the directory that you submitted the job from. Without this statement, the job starts in your home directory, or ~/.
sleep 10 - For this example, we only sleep for 10 seconds. But this could be a command to run an executable, or any regular command that can be executed.
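Because walltime is written as HH:MM:SS, it can help to build the string from an estimate in seconds. The helper below is a minimal sketch; the name walltime_from_seconds is hypothetical and not part of PBS/Torque.

```shell
# Sketch: format a run-time estimate (in seconds) as HH:MM:SS for use in
# "#PBS -l walltime=...". walltime_from_seconds is a hypothetical name.
walltime_from_seconds() {
    total=$1
    hours=$((total / 3600))
    minutes=$(((total % 3600) / 60))
    seconds=$((total % 60))
    printf '%02d:%02d:%02d\n' "$hours" "$minutes" "$seconds"
}

walltime_from_seconds 60      # prints 00:01:00, the request used above
walltime_from_seconds 9000    # prints 02:30:00
```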
Why can't my job write anything to my home directory?
The /home directories are read-only from the worker nodes. As they are not an area intended for active job I/O, this is done to prevent overwhelming the /home storage system while maintaining the ability for jobs to use binaries, config files etc. located there. Please use your corresponding /work directory for output from active jobs.
Please see the FAQ entry "Where should I store my data?" for further information about the difference between the /home and /work filesystems and the intended uses of each.
Where should I store my data?
All HCC machines have two separate areas for every user to store data, each intended for a different purpose.
Your home directory (i.e. /home/[group]/[username]) is meant for items that take up relatively small amounts of space. For example: source code, program binaries, configuration files, etc. This space is quota-limited on a per-group basis. The home directories are backed up for the purposes of best-effort disaster recovery. This space is not intended as an area for I/O to active jobs.
Every user has a corresponding directory under /work using the same naming convention as /home (i.e. /work/[group]/[username]). We encourage all users to use this space for I/O to running jobs. This directory can also be used when larger amounts of space are temporarily needed. It is not quota-limited; however, space in /work is shared among all users. It should be treated as short-term scratch space, and it is not backed up. HCC reserves the right to delete data from this area when space becomes low; whenever the situation allows, users will be notified before this occurs and asked to voluntarily clear space.
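Since /work mirrors the /home naming convention, the matching /work directory can be derived from a home directory path. The following is a sketch under that assumption; work_dir_for is a hypothetical helper name.

```shell
# Sketch: derive the /work directory that corresponds to a /home directory,
# assuming both follow the /[area]/[group]/[username] convention.
work_dir_for() {
    home_dir=$1
    case $home_dir in
        /home/*) echo "/work/${home_dir#/home/}" ;;
        *)       echo "not under /home: $home_dir" >&2; return 1 ;;
    esac
}

# In a submit script, job output could then be directed there, e.g.:
#   OUT_DIR=$(work_dir_for "$HOME") && cd "$OUT_DIR"
```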
If you have space requirements outside what is currently provided, please email hcc-support@unl.edu and we will gladly discuss alternatives.
How can I run a high memory job?
If your job requires more than 250GB of memory on a single worker node, Tusker has a high memory queue available. The two highmem worker nodes have 512GB RAM.
This syntax submits your job to the high memory queue:
You can also specify the highmem queue inside your submit script:
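As a rough sketch of both forms, assuming the queue is literally named highmem (the name used in the text) and relying on the standard -q queue option of qsub, a submit script might look like the following. The script name highmem_job.sh and the job name HighMemJob are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumes the queue is named "highmem".
#
# Form 1, on the command line:
#     qsub -q highmem highmem_job.sh
#
# Form 2, inside the submit script, as a directive:
#PBS -q highmem
#PBS -N HighMemJob
#PBS -l walltime=00:01:00
cd "${PBS_O_WORKDIR:-.}"    # falls back to the current directory outside PBS
echo "job running on $(uname -n)"
```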
How do I create a script for a parallel job?
Similar to the serial job above, a parallel job is submitted using a script.
#!/bin/sh
#PBS -N MPI.Job
#PBS -l select=10
#PBS -l walltime=00:01:00
#PBS -o mpi.stdout
#PBS -e mpi.stderr
#PBS -V
module load openmpi-1.3.3/gcc-4.1.2
cd $PBS_O_WORKDIR
NPROCS=`wc -l < $PBS_NODEFILE`
mpirun -n $NPROCS -machinefile $PBS_NODEFILE ./a.out
This time, let's look at the lines that we added or changed from the serial job above.
#PBS -l select=10 - This tells the scheduler to reserve 10 processors for execution of the job.
#PBS -V - This exports the environment variables of the shell you submitted from to the job.
module load openmpi-1.3.3/gcc-4.1.2 - This loads the module environment for your job (see above).
cd $PBS_O_WORKDIR - The $PBS_O_WORKDIR environment variable is set to the directory that you submitted the job from. Without this statement, the job starts in your home directory, or ~/.
NPROCS=`wc -l < $PBS_NODEFILE` - This line counts the lines in the nodefile (one per allocated processor) and stores the total to pass to the mpirun executable.
mpirun -n $NPROCS -machinefile $PBS_NODEFILE ./a.out - This starts the MPI job 'a.out' as $NPROCS processes on the machines listed in $PBS_NODEFILE.
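To see what the wc -l line counts, the sketch below uses a made-up nodefile in which each allocated processor appears as one line, so a host with several processors is listed several times. The host names are hypothetical, and the one-line-per-processor layout is an assumption about how the nodefile is written; inside a real job, PBS_NODEFILE is set for you.

```shell
# Sketch: what "wc -l < $PBS_NODEFILE" counts, using a stand-in nodefile.
PBS_NODEFILE=$(mktemp)
cat > "$PBS_NODEFILE" <<'EOF'
node01
node01
node01
node01
node02
node02
node02
node02
node03
node03
EOF

NPROCS=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')            # lines = processors
NHOSTS=$(sort -u "$PBS_NODEFILE" | wc -l | tr -d ' ')    # distinct machines
echo "processors: $NPROCS, hosts: $NHOSTS"               # processors: 10, hosts: 3
rm -f "$PBS_NODEFILE"
```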
How do I submit a job to the scheduler?
First, you must create a submission script. Then the command 'qsub <script_name>' will submit the job.
Now that I've submitted my job, how do I check the status?
To check the status of your job, use the 'qstat' command.
How do I kill a job?
To kill a job, use 'qdel <JobID>'.
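The three commands fit together as a short workflow. The function below is a sketch, not an HCC-provided tool: submit_and_check is a hypothetical name, and it assumes qsub prints the new job's ID on standard output.

```shell
# Sketch: submit a script, show its status, and print how to cancel it.
submit_and_check() {
    script=$1
    if ! command -v qsub >/dev/null 2>&1; then
        echo "qsub not found; run this on a login node" >&2
        return 1
    fi
    jobid=$(qsub "$script") || return 1   # assumption: job ID is printed on stdout
    echo "submitted: $jobid"
    qstat "$jobid"                        # status of just this job
    echo "to cancel: qdel $jobid"
}
```

For example, submit_and_check submit.sh, where submit.sh is your own submit script.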
How do I compile an MPI program?
First, log in and use the module command to select the version of OpenMPI you wish to use. For example, module load openmpi-1.3.3/gcc-4.1.2. (See the above section for more information on using module.)
The MPI binaries and libraries will now be available in your environment.
Please note that you will need to include the appropriate module load statement in your submit script.
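Once the module is loaded, a C program is typically compiled with OpenMPI's mpicc wrapper. The sketch below wraps that call with two sanity checks; compile_mpi and the source file name hello_mpi.c are hypothetical, and other languages use different wrappers.

```shell
# Sketch: compile a C MPI program with mpicc after loading an OpenMPI module.
compile_mpi() {
    src=$1
    out=${2:-a.out}
    if ! command -v mpicc >/dev/null 2>&1; then
        echo "mpicc not found; load an OpenMPI module first" >&2
        return 1
    fi
    if [ ! -f "$src" ]; then
        echo "source file not found: $src" >&2
        return 1
    fi
    mpicc -o "$out" "$src"
}

# Example (hypothetical file name):
#   module load openmpi-1.3.3/gcc-4.1.2
#   compile_mpi hello_mpi.c
```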
My job requires me to use an entire node or nodes exclusively. How do I request this?
There are times your job may require exclusive access to a node or nodes.
To request exclusive access, use:
#PBS -l nodes=N:ppn=64
where N is the number of nodes your job requires.
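If you know how many processes the job needs, N follows by rounding up to whole nodes. This sketch assumes 64 cores per node, as the ppn=64 value implies; nodes_needed is a hypothetical helper name.

```shell
# Sketch: number of whole nodes needed for a given process count,
# assuming 64 cores per node as implied by ppn=64.
CORES_PER_NODE=64

nodes_needed() {
    procs=$1
    echo $(( (procs + CORES_PER_NODE - 1) / CORES_PER_NODE ))   # ceiling division
}

N=$(nodes_needed 100)
echo "#PBS -l nodes=${N}:ppn=${CORES_PER_NODE}"   # prints #PBS -l nodes=2:ppn=64
```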
What are the complexity requirements when changing my password?
New passwords must be at least 9 characters long, and contain at least one number, one capital letter, and one symbol or punctuation mark.
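The rules above can be checked locally before you try a new password. This is an illustrative sketch only; check_password is a hypothetical name, and the validation HCC actually applies may differ in detail.

```shell
# Sketch: test a candidate password against the stated rules
# (at least 9 characters, one number, one capital letter, one symbol).
check_password() {
    pw=$1
    [ "${#pw}" -ge 9 ] || { echo "too short"; return 1; }
    case $pw in *[[:digit:]]*) ;; *) echo "needs a number"; return 1 ;; esac
    case $pw in *[[:upper:]]*) ;; *) echo "needs a capital letter"; return 1 ;; esac
    case $pw in *[[:punct:]]*) ;; *) echo "needs a symbol or punctuation mark"; return 1 ;; esac
    echo "ok"
}
```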