This page describes a complete example of submitting an HTCondor job.
SSH to Crane
[apple@localhost]$ ssh apple@crane.unl.edu
[apple@login.crane ~]$
Write a simple Python program in a file named “hello.py” that we will run using HTCondor
[apple@login.crane ~]$ vim hello.py
In the editor, enter the code below:
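#!/usr/bin/env python
import sys
import time

# Print 1 through 6, pausing one second between numbers.
i = 1
while i <= 6:
    print(i)
    i += 1
    time.sleep(1)

print(2**8)
# sys.argv[1] is the command-line argument passed in by HTCondor.
print("hello world received argument = " + sys.argv[1])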
This program prints the numbers 1 through 6 on stdout, one per second,
then prints the number 256, and finally prints
hello world received argument = <argument>, where <argument> is the
command-line argument passed to hello.py.
Write an HTCondor submit script named “hello.submit”
Universe = vanilla
Executable = hello.py
Output = OUTPUT/hello.out.$(Cluster).$(Process).txt
Error = OUTPUT/hello.error.$(Cluster).$(Process).txt
Log = OUTPUT/hello.log.$(Cluster).$(Process).txt
notification = Never
Arguments = $(Process)
PeriodicRelease = ((JobStatus==5) && (CurrentTime - EnteredCurrentStatus) > 30)
OnExitRemove = (ExitStatus == 0)
Queue 4
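The PeriodicRelease and OnExitRemove lines in the submit script control what happens to a job that is held or that exits. As a rough illustration only, the sketch below mirrors their logic in Python. HTCondor evaluates the real expressions itself as ClassAd expressions; the function names and the dictionary layout here are hypothetical.

```python
# Python mirror of two expressions from hello.submit.
# Illustration only: HTCondor evaluates the real ClassAd expressions itself.

HELD = 5  # JobStatus value that HTCondor uses for a held job


def periodic_release(job, current_time):
    """Mirror of (JobStatus == 5) && (CurrentTime - EnteredCurrentStatus) > 30."""
    return job["JobStatus"] == HELD and (current_time - job["EnteredCurrentStatus"]) > 30


def on_exit_remove(exit_status):
    """Mirror of (ExitStatus == 0)."""
    return exit_status == 0


# Hypothetical job that entered the held state at t = 1000 seconds.
held_job = {"JobStatus": HELD, "EnteredCurrentStatus": 1000}
print(periodic_release(held_job, 1010))  # held for 10 s -> False
print(periodic_release(held_job, 1031))  # held for 31 s -> True
print(on_exit_remove(0))                 # True
print(on_exit_remove(1))                 # False
```

In words: a job that has been held for more than 30 seconds is released, and a job leaves the queue only when it exits with status 0.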
Create an OUTPUT directory to hold the output files generated by your job (the submit script above writes to the OUTPUT directory)
[apple@login.crane ~]$ mkdir OUTPUT
Submit your job
[apple@login.crane ~]$ condor_submit hello.submit
Submitting job(s)
....
4 job(s) submitted to cluster 1013054.
Check the status of your jobs with condor_q
[apple@login.crane ~]$ condor_q
-- Schedd: login.crane.hcc.unl.edu : <129.93.227.113:9619?...
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
720587.0 logan 12/15 10:48 33+14:41:17 H 0 0.0 continuous.cron 20
720588.0 logan 12/15 10:48 200+02:40:08 H 0 0.0 checkprogress.cron
1012864.0 jthiltge 2/15 16:48 0+00:00:00 H 0 0.0 test.sh
1013054.0 jennyshao 4/3 17:58 0+00:00:00 R 0 0.0 hello.py 0
1013054.1 jennyshao 4/3 17:58 0+00:00:00 R 0 0.0 hello.py 1
1013054.2 jennyshao 4/3 17:58 0+00:00:00 I 0 0.0 hello.py 2
1013054.3 jennyshao 4/3 17:58 0+00:00:00 I 0 0.0 hello.py 3
7 jobs; 0 completed, 0 removed, 0 idle, 4 running, 3 held, 0 suspended
The three job statuses that appear in the output above are listed below:
Symbol | Representation |
---|---|
H | Held |
R | Running |
I | Idle and waiting |
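To make the status column concrete, here is a minimal sketch that tallies the jobs in the condor_q listing above by status. It assumes the column layout shown in that listing (ID, OWNER, a two-part SUBMITTED field, RUN_TIME, then ST); other HTCondor versions may format the output differently. The names `STATUS_NAMES` and `count_statuses` are hypothetical.

```python
from collections import Counter

# Meaning of the ST column symbols that appear in the listing above.
STATUS_NAMES = {"H": "Held", "R": "Running", "I": "Idle"}

# Job rows copied from the condor_q listing above.
CONDOR_Q_ROWS = """\
720587.0 logan 12/15 10:48 33+14:41:17 H 0 0.0 continuous.cron 20
720588.0 logan 12/15 10:48 200+02:40:08 H 0 0.0 checkprogress.cron
1012864.0 jthiltge 2/15 16:48 0+00:00:00 H 0 0.0 test.sh
1013054.0 jennyshao 4/3 17:58 0+00:00:00 R 0 0.0 hello.py 0
1013054.1 jennyshao 4/3 17:58 0+00:00:00 R 0 0.0 hello.py 1
1013054.2 jennyshao 4/3 17:58 0+00:00:00 I 0 0.0 hello.py 2
1013054.3 jennyshao 4/3 17:58 0+00:00:00 I 0 0.0 hello.py 3
"""


def count_statuses(rows):
    """Count jobs per status; ST is the sixth whitespace-separated field."""
    counts = Counter()
    for line in rows.splitlines():
        fields = line.split()
        if len(fields) >= 6:
            symbol = fields[5]
            counts[STATUS_NAMES.get(symbol, symbol)] += 1
    return dict(counts)


print(count_statuses(CONDOR_Q_ROWS))
```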
Explanation of $(Cluster) and $(Process) in the HTCondor script
$(Cluster) and $(Process) are variables that are available in the variable namespace of the HTCondor script. $(Cluster) is the prefix of your job ID, and $(Process) ranges from 0 through N - 1, where N is the number of jobs requested with Queue. If you submit a single job, $(Cluster) by itself identifies it; otherwise the full job ID combines the two, written as $(Cluster).$(Process).
In this example, $(Cluster) is “1013054” and $(Process) ranges from “0” to “3” for the above HTCondor script.
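The expansion can be sketched in a few lines of Python. This is plain string replacement for illustration, not HTCondor's actual macro engine; the helper name `expand` is hypothetical, and the cluster number is the one reported by condor_submit above.

```python
def expand(template, cluster, process):
    """Substitute $(Cluster) and $(Process) in a submit-file value."""
    return (template
            .replace("$(Cluster)", str(cluster))
            .replace("$(Process)", str(process)))


OUTPUT_TEMPLATE = "OUTPUT/hello.out.$(Cluster).$(Process).txt"
QUEUE_COUNT = 4        # from "Queue 4" in hello.submit
CLUSTER_ID = 1013054   # cluster number reported by condor_submit

# One output file name per process, for process numbers 0 through 3.
for process in range(QUEUE_COUNT):
    print(expand(OUTPUT_TEMPLATE, CLUSTER_ID, process))
```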
In most cases these variables are used to modify the behavior of each individual task in an HTCondor submission, for example to vary the input files, output files, or parameters of the program being run. In this example we simply pass $(Process) as an argument, which hello.py reads as sys.argv[1].
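As one possible way to vary the work per task, a job script could use the process number to choose its own parameter set. The sketch below is an assumption for illustration and is not part of hello.py; the file names, the `scale` values, and the function `select_parameters` are all hypothetical.

```python
# Hypothetical parameter sets, one per process number (Queue 4 gives 0..3).
PARAMETERS = [
    {"input": "data_0.txt", "scale": 1.0},
    {"input": "data_1.txt", "scale": 2.0},
    {"input": "data_2.txt", "scale": 4.0},
    {"input": "data_3.txt", "scale": 8.0},
]


def select_parameters(process_arg):
    """Pick the parameter set for a process number given as a string."""
    index = int(process_arg)
    if not 0 <= index < len(PARAMETERS):
        raise ValueError("no parameters defined for process %d" % index)
    return PARAMETERS[index]


# In a real job script the argument would come from sys.argv[1],
# which HTCondor fills in from "Arguments = $(Process)".
print(select_parameters("2"))
```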
The lines of interest for this discussion from the HTCondor script “hello.submit” are listed below:
Output = OUTPUT/hello.out.$(Cluster).$(Process).txt
Arguments = $(Process)
Queue 4
The line of interest for this discussion from the file “hello.py” is listed below:
print("hello world received argument = " + sys.argv[1])
Viewing the results of your job
After your job has completed, you can use the Linux “cat” or “vim” command to view the job output in the OUTPUT directory.
For example, in the file name hello.out.1013054.2.txt, “1013054” is $(Cluster) and “2” is $(Process).
The contents of the output file “hello.out.1013054.2.txt” are shown below:
1
2
3
4
5
6
256
hello world received argument = 2
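When many output files accumulate, it can help to recover the cluster and process numbers from a file name. A minimal sketch, assuming the naming pattern used in hello.submit above; the function name `parse_output_name` is hypothetical.

```python
import os


def parse_output_name(path):
    """Return (cluster, process) from hello.out.<Cluster>.<Process>.txt."""
    name = os.path.basename(path)
    parts = name.split(".")
    # Expected parts: ["hello", "out", "<Cluster>", "<Process>", "txt"]
    if len(parts) != 5 or parts[:2] != ["hello", "out"] or parts[4] != "txt":
        raise ValueError("unexpected file name: %s" % name)
    return int(parts[2]), int(parts[3])


print(parse_output_name("OUTPUT/hello.out.1013054.2.txt"))  # (1013054, 2)
```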
Please see the link below for one more example:
http://research.cs.wisc.edu/htcondor/tutorials/intl-grid-school-3/submit_first.html