Queues:
- Timeshared Queues: (queue name syntax: s,m,l = short, med, large; t,j = time, job)
| max. memory | # of PEs | up to 5 h CPU per proc | up to 50 h CPU per proc | 50-200 h CPU per proc | 200-400 h CPU per proc |
| up to 2 GB | up to 8 | vst_sj | st_sj | mt_sj | lt_sj |
| 2-4 GB | 9-16 | vst_mj | st_mj | mt_mj | lt_mj |
| 4-8 GB | 17-64 | vst_lj | st_lj | mt_lj | lt_lj |
- Dedicated Queues:
| max. memory | # of PEs | queue name | hours (wall-clock) |
| up to 64 GB | 128 | 128_ded_short | 1:00 h |
| up to 64 GB | 128 | 128_ded_med | 15:00 h |
| up to 64 GB | 128 | 128_ded_long | 50:00 h (weekend only) |
- Interactive: 16 PEs per user, 256 MB memory limit, 15 CPU-minute time limit
- Important:
  - The CPU-time limit in the table has to be divided by the number of PEs to get the "wall clock time" limit.
  - For the qs2 script, you specify the total CPU time if you run in the normal queues (0-64 PEs): multiply the per-proc CPU hours in the table by the number of procs you requested!
  - For the dedicated queues (128 PEs), you specify wall-clock time to the qs2 script. Dedicated queues are tough on the allocation quota.
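The relation between total CPU time, per-proc CPU time and queue name can be sketched in a few lines of Python. This is only an illustration of the timeshared table above, not part of qs2: the function name `select_queue` is hypothetical, the upper bounds are assumed to be inclusive, and the memory limits are ignored.

```python
# Upper bounds read off the timeshared-queue table.
# Treating them as inclusive is an assumption.
TIME_CLASSES = [(5, "vst"), (50, "st"), (200, "mt"), (400, "lt")]  # CPU hours per proc
SIZE_CLASSES = [(8, "sj"), (16, "mj"), (64, "lj")]                 # number of PEs


def select_queue(n_pes, total_cpu_hours):
    """Return the timeshared queue name for a request (hypothetical helper)."""
    if n_pes < 1:
        raise ValueError("need at least one PE")
    # Total CPU time divided by the number of PEs gives the per-proc time,
    # which is also the approximate wall-clock limit.
    hours_per_proc = total_cpu_hours / n_pes
    time_tag = next((tag for limit, tag in TIME_CLASSES if hours_per_proc <= limit), None)
    size_tag = next((tag for limit, tag in SIZE_CLASSES if n_pes <= limit), None)
    if time_tag is None or size_tag is None:
        raise ValueError("request does not fit any timeshared queue")
    return f"{time_tag}_{size_tag}"


if __name__ == "__main__":
    # 8 PEs with 40 CPU hours in total is 5 hours per proc.
    print(select_queue(8, 40))
```

Requests with more than 64 PEs raise an error here, since those belong to the dedicated queues, where the time is given as wall-clock time instead.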
- qs2: Since the jobs are distributed to different machines whose /scratch partition is writable by the batch system only, the qs2 script uses the following mechanism to spool jobs:
  - The script generates a directory on the permanent storage facility, named basedir_outdir, where basedir and outdir are the Cactus parameters ("nameofparfile" is expanded). It mimics the directory hierarchy with underscores, since ftp cannot generate multiple directories.
  - The executable and the parameter file are transferred to this directory. The exe is renamed to outdir_exe.
  - At execution time, these files are transferred to the local scratch and executed.
  - After execution, 1D, 2D, 3D and checkpoint files are tarred up independently and moved to the permanent storage directory.
  - NOTE: qs2 will only look in the outdir directory; if you put your checkpoint files someplace else, you need to modify the submission script.
  - Note: since this mechanism involves spooling of the submitted job, changes to the local parameter file will have no effect. If you don't want to lose your position in the queue, change the files in the perm. storage directory.
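The naming scheme for the spool directory can be illustrated as follows. This is a guess at the rule described above, not the actual qs2 code: the helper names are hypothetical, and it is assumed that "/" separators become underscores and that the literal string "nameofparfile" is replaced by the parameter file name without its extension.

```python
from pathlib import PurePosixPath


def _flatten(value, parfile):
    """Expand "nameofparfile" and turn path separators into underscores."""
    stem = PurePosixPath(parfile).stem          # "bhole.par" -> "bhole"
    value = value.replace("nameofparfile", stem)
    return "_".join(p for p in value.split("/") if p and p != ".")


def spool_dir_name(basedir, outdir, parfile):
    """Name of the directory on permanent storage: basedir_outdir (assumed rule)."""
    return "_".join(part for part in (_flatten(basedir, parfile),
                                      _flatten(outdir, parfile)) if part)


def spooled_exe_name(outdir, parfile):
    """Name of the renamed executable: outdir_exe (assumed rule)."""
    return _flatten(outdir, parfile) + "_exe"


if __name__ == "__main__":
    print(spool_dir_name("runs/bh", "nameofparfile", "bhole.par"))
    print(spooled_exe_name("nameofparfile", "bhole.par"))
```

Knowing the flattened name is mainly useful for finding the spooled parameter file on permanent storage when you want to change it without resubmitting.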
Job submission:
- Running Jobs on the Origin
  - qs2 16 bhole.par 80:00 512M [optional args ...]
    submits a job requesting 16 PEs, 512 MB of memory and 80 hours of CPU time summed over all procs. Since this is 5:00 hours per process, you end up in the very short time class (vst_mj for 16 PEs, according to the table above), which is ideal for debugging, etc.
  - qs2 128 bhole.par 15:00 1G [optional args ...]
    submits a job to the 128-PE dedicated queue with a runtime of 15:00 h wall-clock, requesting 1 gigabyte of memory.
  - busage gives job resource statistics on running and completed jobs.
  - bjobs displays the status of jobs, queues, and the system.
  - bpeek jobID# gives you the output written by the job.
  - bkill jobID# deletes a job from the queue.
  - bqueues [-l name] displays queue information.
  - HINT: to see if your job starts off properly, request about a minute of CPU time and at most 9 procs, and your job will be served nearly instantaneously in a special debug queue.
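To avoid mistakes with the time argument, the qs2 command line can be assembled by a small helper. The sketch below is hypothetical and only mirrors the two examples above: it assumes the time argument is written as hours:minutes, multiplies the per-proc hours by the number of PEs for the normal queues, and passes wall-clock hours through unchanged for the 128-PE dedicated queues.

```python
def format_hours(hours):
    """Format a number of hours as H:MM, e.g. 80 -> "80:00" (assumed format)."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60}:{total_minutes % 60:02d}"


def qs2_args(n_pes, parfile, hours, memory):
    """Build the qs2 argument list (hypothetical helper).

    For normal queues (up to 64 PEs), `hours` is the CPU time per proc and
    is multiplied by n_pes. For the dedicated queues (128 PEs), `hours` is
    the wall-clock time and is used as given.
    """
    dedicated = n_pes == 128
    time_arg = hours if dedicated else hours * n_pes
    return ["qs2", str(n_pes), parfile, format_hours(time_arg), memory]


if __name__ == "__main__":
    print(" ".join(qs2_args(16, "bhole.par", 5, "512M")))
    print(" ".join(qs2_args(128, "bhole.par", 15, "1G")))
```

The returned list can be joined into a command line, as in the last two lines, or handed to a process launcher; any optional qs2 arguments would have to be appended by hand.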