Batch Jobs

Running batch jobs

Basic example

Kubernetes has built-in support for running batch jobs. A Job is a controller that watches your pod and makes sure it exits with status 0. If the pod fails for any reason, it is retried up to backoffLimit times.

Since jobs in Nautilus are not limited in runtime, you may only run jobs that have a meaningful command field. Running in manual mode (a sleep infinity command followed by a manual start of the computation) is prohibited.

Let’s run a simple job and get its result.

Create a job.yaml file and submit:

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          limits:
            memory: 200Mi
            cpu: 1
          requests:
            memory: 50Mi
            cpu: 50m
      restartPolicy: Never
  backoffLimit: 4

Explore what’s running:

kubectl get jobs
kubectl get pods

When the job is finished, its pod stays in the Completed state and the Job shows 1/1 in the COMPLETIONS column. For long jobs, pods can pass through Error, Evicted, and other states until one finishes successfully or backoffLimit is exhausted.

This example job did not use any storage and wrote its result to stdout, which you can read from the pod logs:

kubectl logs pi-<hash>

By default, the pod and job are kept for inspection for ttlSecondsAfterFinished=604800 seconds (1 week) after the job finishes. You can adjust this value in your job definition if desired.

Please make sure you do not leave any pods or jobs behind. To delete the job, run:
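As a minimal sketch of adjusting the retention time, the job below sets ttlSecondsAfterFinished in the Job spec. The value 86400 (1 day) and the job name pi-short-ttl are assumptions chosen for illustration; the rest mirrors the basic example.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-short-ttl   # hypothetical name
spec:
  # Assumed value: delete the job and its pods 1 day after the job finishes.
  ttlSecondsAfterFinished: 86400
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          limits:
            memory: 200Mi
            cpu: 1
          requests:
            memory: 50Mi
            cpu: 50m
      restartPolicy: Never
  backoffLimit: 4
```

Copy any logs you need before the timeout expires, since the pod is removed together with the job.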

kubectl delete job pi

Running several bash commands

You can group several commands, and use pipes, like this:

  command:
    - sh
    - -c
    - "cd /home/user/my_folder && apt-get update && apt-get install -y wget && wget pull some_file && do something else"
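For longer scripts, a YAML block scalar can be easier to read than one long string. The fragment below is a sketch only: the commands and the file name counts.txt are hypothetical placeholders that show several commands and a pipe running in a single shell.

```yaml
  command:
    - sh
    - -c
    - |
      # Stop at the first failing command so the job reports an error.
      set -e
      cd /tmp
      # A pipe: count duplicate lines and save the result.
      printf 'b\na\nb\n' | sort | uniq -c > counts.txt
      cat counts.txt
```

Because of set -e, a failing command makes the script exit with a non-zero status, so the job counts the run as failed.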

Logs

All stdout and stderr output from the script is preserved and can be read by running:

kubectl logs pod_name

Output from an initContainer can be seen by passing its name with -c (here, a container named init-clone-repo):

kubectl logs pod_name -c init-clone-repo
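For context, the following is a sketch of a job whose pod has an initContainer named init-clone-repo, matching the name used in the command above. Everything else is an assumption made for illustration: the job name, the alpine/git and busybox images, the repository URL, and the /opt/repo mount path are not taken from the Nautilus setup.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: clone-example   # hypothetical name
spec:
  template:
    spec:
      initContainers:
      - name: init-clone-repo
        image: alpine/git   # assumed image that provides git
        # Placeholder URL: replace with your own repository.
        command: ["git", "clone", "--depth=1", "https://example.com/your/repo.git", "/opt/repo"]
        volumeMounts:
        - name: repo
          mountPath: /opt/repo
      containers:
      - name: main
        image: busybox      # assumed image
        # Runs only after the initContainer has exited successfully.
        command: ["ls", "-la", "/opt/repo"]
        volumeMounts:
        - name: repo
          mountPath: /opt/repo
        resources:
          limits:
            memory: 200Mi
            cpu: 1
          requests:
            memory: 50Mi
            cpu: 50m
      volumes:
      # Scratch volume shared by both containers; removed with the pod.
      - name: repo
        emptyDir: {}
      restartPolicy: Never
  backoffLimit: 4
```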

To follow the logs in real time, run:

kubectl logs -f pod_name

The pod will remain in the Completed state until you delete it or the timeout (ttlSecondsAfterFinished) expires.

Retries

The backoffLimit field specifies how many times your pod will be retried if the exit status of your script is not 0 or if the pod was terminated for a different reason (for example, a node was rebooted). It is a good idea to set it to a value greater than 0.

Fair queueing

There is no fair queue implemented on Nautilus. If you submit 1000 jobs, you block all other users from submitting in the cluster.

To limit your submission to a fair portion of the cluster, refer to this guide. Make sure to use a deployment and persistent storage for the Redis pod. Here’s our example

CPU only jobs

Nautilus is primarily used for GPU jobs. While it’s possible to run large CPU-only jobs, you have to take certain measures to prevent taking over all cluster resources.

You can run the jobs with lower priority and allow other jobs to preempt yours. This way you do not need to worry about the size of your jobs, and you can use the maximum amount of resources in the cluster. To do that, add the opportunistic priority class to your pods:

    spec:
      priorityClassName: opportunistic

Another measure is to avoid the GPU nodes. This way you can be sure you are using only the CPU-only nodes and your jobs are not preventing any GPU usage. To do this, add a node anti-affinity rule for the GPU device label to your pod:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: feature.node.kubernetes.io/pci-10de.present
                operator: NotIn
                values:
                - "true"

You can use either method on its own or combine both.
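Putting both fragments into one job definition could look like the sketch below. The job name cpu-only-example and the container name worker are hypothetical, and the workload is reused from the basic example; only the priorityClassName and affinity sections come from the fragments above.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cpu-only-example   # hypothetical name
spec:
  template:
    spec:
      # Lets other jobs preempt this one.
      priorityClassName: opportunistic
      # Keeps the pod off nodes that report an NVIDIA PCI device.
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: feature.node.kubernetes.io/pci-10de.present
                operator: NotIn
                values:
                - "true"
      containers:
      - name: worker
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
        resources:
          limits:
            memory: 200Mi
            cpu: 1
          requests:
            memory: 50Mi
            cpu: 50m
      restartPolicy: Never
  backoffLimit: 4
```

Both settings belong to the pod spec under template.spec, not to the top-level Job spec.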