Your home directory (/home/username) on the cluster is independent of your normal /homes/gws account. This local home directory is not backed up, so please keep backups of important code and documents elsewhere. This directory is physically located on the node 'hail'.
It is possible to get access to your /homes/gws account. Email CSE Support asking them to export your home directory to the machine hail. Once you get confirmation from Support that the export has been done, try 'ls /homes/gws/username'. If this doesn't work, email CSE Support asking that the maps be updated.
Exporting home directories or any other NFS mounts to hail is considered a security risk by the folks who tend most of the department machines: if hail were ever compromised, an attacker would gain access to those NFS areas. Hail is kept up to date with patches to prevent such breaches.
Once your gws account has been exported, you may find it convenient to make a symbolic link in your local home directory that points to your normal homedir. You can do this by running the command 'ln -s /homes/gws/<username> gws'. This will create a symlink called 'gws' in your hail home directory. You can now access anything in your gws homedir by cd'ing into ~/gws.
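For example, assuming your username were jdoe (a placeholder), the full sequence would be:

    ln -s /homes/gws/jdoe gws     # create the symlink once
    cd ~/gws                      # now behaves like your gws homedir
    ls                            # should list your gws files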
Each compute node has /scratch. Those nodes with two disks also have
/scratch2. These areas are free for anyone to use. Please make a
subdirectory under /scratch to contain your work (e.g., /scratch/myname).
Any directories left untouched for 6 months are eligible for purging unless
arrangements are made. This space is not appropriate for general backups
of your data.
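A minimal example of claiming a scratch area, using the placeholder name from above:

    mkdir /scratch/myname
    cd /scratch/myname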
Use /tmp for truly temporary needs (during execution, small data sets);
use /scratch or /scratch2 for longer-term scratch storage (i.e., the length
of a project).
Your local (/home/username) homedir is also appropriate for some
longer-term data storage. Anything that needs to be backed up must be
stored either in /projects or your gws home directory. Nothing
on the hail cluster is backed up.
The most straightforward way to execute a job under Torque is with a command like:
    qsub -N MyJobName /full/path/to/prog

This will return text like "123.master", which is the job ID and the host that the job was submitted to. Note that you cannot pass any command line arguments into the program.
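For instance (the job name and path are the placeholders from above; the returned ID will differ):

    $ qsub -N MyJobName /full/path/to/prog
    123.master

You can then check on the job with 'qstat 123.master'.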
If you are getting an error 'cannot execute binary file', try submitting a job using a description file as described next.
Alternatively, you can write a job description file. The major advantage of a description file is that it allows you to run multiple executables as part of the same job. A description file that is equivalent to the one-liner above is:
    ## My comments follow double hashes
    ## command line flags for the qsub command are given
    ## in this file starting with '#PBS'
    #PBS -N MyJobName
    /full/path/to/prog
This is then queued by 'qsub MyJobDescFile'.
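Because the description file is run as a shell script, it can hold several commands, and those commands may take arguments. A two-step job might look like this (the paths and flag below are hypothetical):

    ## hypothetical two-step job: preprocess, then run
    #PBS -N TwoStepJob
    /full/path/to/preprocess --verbose
    /full/path/to/prog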
A job defaults to using only one processor on one node. To request two processors on one node, use the qsub flag '-l nodes=1:ppn=2'. For two nodes with one CPU per node, use '-l nodes=2:ppn=1'. ppn stands for 'Processors per Node'.
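For example, to run the earlier job on two processors of a single node:

    qsub -N MyJobName -l nodes=1:ppn=2 /full/path/to/prog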
The -l flag is also used to specify other properties of a node. Currently the following properties are defined:
You can request a specific node by using '-l hosts=nodename'.
Read the qsub manpage for other flags.
Logging into a compute node directly shouldn't be necessary, except in
unusual cases (i.e., debugging batch processes). Use 'rsh nodename',
where nodename is n01 through n36. rsh should be used only when working
on the cluster.
Use ssh/scp when communicating with non-hail nodes.
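A couple of illustrative commands (the node and host names are placeholders):

    rsh n07                       # log into compute node n07 from within the cluster
    scp results.tar.gz myhost:    # copy data off the cluster to a non-hail machine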
Last updated May 23, 2005.