CryoSPARC Interactive App

CryoSPARC is a complete software solution for processing cryo-electron microscopy (cryo-EM) data. On HCC resources, CryoSPARC can be accessed as an Interactive GUI App via the Swan Open OnDemand (OOD) portal.

Launching CryoSPARC via the Open OnDemand portal starts an interactive Desktop that runs the CryoSPARC “master” process along with a single worker process on the allocated compute node. This is very similar to a Single Workstation install, as described in the CryoSPARC documentation, on a machine with modest resources. While one can use the interactive Desktop to perform CryoSPARC analyses, the Desktop is mostly used to submit computationally intensive (including GPU) CryoSPARC jobs to the cluster via SLURM. Therefore, launching the Desktop does not require significant CPU or RAM resources, nor a GPU node. The submitted CryoSPARC jobs, on the other hand, can take full advantage of GPU nodes. The particular Swan GPU nodes used for the submitted CryoSPARC jobs can be set using the Show advanced settings… checkbox in the CryoSPARC OOD Form.

The CryoSPARC App and submitted tasks run as SLURM jobs under your personal HCC account. As with any other SLURM job, you have full control over the processes and the data files. The CryoSPARC projects and data files on Swan can only be accessed by you, unless you explicitly choose to share them.

Basic usage

  • Navigate to the Swan OOD portal and log in with your HCC credentials.
  • Click “Interactive Apps” followed by CryoSPARC.
  • This will open a form that needs to be filled in with information regarding CryoSPARC and the resources requested. Below is an explanation of some of the fields required to launch the CryoSPARC App:

    • CryoSPARC License ID: To start CryoSPARC you will need to have a license ID. Each user must have their own license ID. License IDs may not be shared, even within a lab group. Attempting to share a license ID will cause jobs to fail. If you have multiple CryoSPARC sessions running at the same time, you will need a different license ID for each session. CryoSPARC license IDs are free for non-profit academic use and can be obtained from here.
    • Start a new CryoSPARC session?: The default location of the CryoSPARC database is $WORK/cryosparc. If you select this checkbox, all the existing database and configuration files in the CryoSPARC session location will be erased. Please select this checkbox only if needed.
    • Number of cores: This parameter defines the cores needed for the CryoSPARC “master” process. The “master” process does not require many resources, so requesting 1-2 cores should be sufficient.
    • Running time in hours: This parameter defines the runtime of the CryoSPARC “master” process. The “master” process should be running while the CryoSPARC SLURM jobs are running. If you are not sure how long the submitted jobs will be running for, please select the maximum runtime of 168 hours (7 days) to avoid any issues if the CryoSPARC “master” process finishes before the submitted jobs. While the CryoSPARC “master” process is running and the GPU CryoSPARC jobs are submitted to the cluster, you don’t need to have the CryoSPARC OOD App open.
    • Requested RAM in GBs: This parameter defines the memory (RAM) needed for the CryoSPARC “master” process. The “master” process does not require many resources, so requesting 8 GB, for example, should be sufficient.
    • Partition selection: This parameter defines the partition used for the CryoSPARC “master” process. The “master” process does not require GPUs, so you can use a CPU partition, such as batch, or any other leased partition you have access to.
  • In addition to the basic fields, there are several advanced settings you can set under the Show advanced settings… checkbox in the CryoSPARC OOD Form:

    • CryoSPARC session path: The default location of the CryoSPARC database and configuration files is $WORK/cryosparc. You can change this location using the Select Path button. You can select any of the available file systems on Swan aside from the $HOME filesystem. We do not recommend using $HOME for the session folder. Please note that each file system has its own advantages and disadvantages. Note this location is *not* the same as the project path(s) where projects are stored. This location is used for CryoSPARC’s internal database and configuration. Do not change this value unless you are absolutely sure of the ramifications.
    • Specify partition for CryoSPARC GPU jobs: This parameter defines the GPU partition used for the GPU CryoSPARC jobs submitted to the cluster via SLURM through the interactive Desktop. If not specified, by default, these jobs are submitted to the general gpu partition. If you have access to a priority access partition with GPU resources, you may specify that partition here to reduce queue time.
    • Highmem factor: This parameter defines the multiplicative factor used to increase the memory (RAM) request for the highmem cluster lane (described below). That is, the preset memory values within CryoSPARC for each job type will be multiplied by this factor. Certain input sets combined with particular options may require more memory than the preset CryoSPARC values, and will otherwise fail. Use this value together with the highmem cluster lane to increase the requested memory on specific jobs so that they can complete successfully. Valid values are integers from 1 to 10.
    • Batch job maximum runtime: This parameter controls the maximum runtime of the batch jobs CryoSPARC submits to SLURM via the cluster lanes. This value must be greater than the runtime of the longest CryoSPARC task submitted to the cluster lanes; otherwise those tasks will be killed by the scheduler. Smaller values will generally decrease queue time, so if you are confident all tasks for your particular datasets will complete sooner, you may lower this value to improve job throughput.
    • Allow multiple CryoSPARC masters: By default only one instance of the CryoSPARC app can be run at a time, as the app launches the CryoSPARC master process. Selecting this box overrides this limitation and allows multiple copies of the app (i.e. master) to run. This can be useful for testing a newer CryoSPARC version side by side with an existing version for example. In order to avoid data corruption and loss, you must use a distinct session path for each master. Multiple masters cannot share a session path! HCC strongly recommends against using this option unless you know exactly what you’re doing and have a secure backup of both your project(s) and CryoSPARC database(s).

After selecting the needed resources and pressing “Launch”, the CryoSPARC App will start. Depending on the requested resources, this process should take a few minutes.
When the App is ready, the CryoSPARC OOD Info Card will show the email and password needed to log in to CryoSPARC, as well as some other useful information, such as the Login URL and a few auxiliary programs. The email address value to use is <username>@swan.unl.edu, where <username> is replaced with your HCC username. Please note that this is not a functioning email address, and is only used for logging into CryoSPARC on Swan. Once the session is ready, you will see a “Launch CryoSPARC:Swan” button. This button will start the CryoSPARC interactive Desktop and open the Firefox browser to the CryoSPARC login page. Please use the provided email and password to log in to the CryoSPARC Firefox session and start using CryoSPARC.

CryoSPARC SLURM jobs

Once you create a CryoSPARC job and are ready to queue it, you can select one of the three available lanes to run the job on: default, swan, or swan-highmem.

  • Lane default (node) runs the job on the current compute node. This job will use the resources and partition selected in the CryoSPARC OOD Form. This will usually be a modest amount of resources (1-2 cores and 8 GB of RAM), so the default lane should only be used for short-running, light tasks.
  • Lane swan (cluster) uses SLURM to submit the CryoSPARC job to the cluster. These jobs have a maximum runtime of 7 days, use the number of CPUs or GPUs you have specified when creating the CryoSPARC job, and use the partition you have selected under the Show advanced settings… checkbox in the CryoSPARC OOD Form. The submitted CryoSPARC jobs will run in the background and, depending on the requested resources and partition utilization, may wait in the queue for some time before they start running. The swan lane should be used for the majority of the computing tasks.
  • Lane swan-highmem (cluster) is otherwise identical to the swan lane, but increases the amount of memory (RAM) requested by a constant factor. Certain jobs may require more memory than the preset values within CryoSPARC. This lane can be used for jobs that will otherwise fail in the standard swan lane due to insufficient memory. The default is to multiply the preset CryoSPARC values by a factor of 4, but this factor may be changed when the CryoSPARC app is launched under advanced settings. Using this lane when it is not needed will result in longer queue times.

Please note that the “master” process should be running while the submitted CryoSPARC jobs are queued and running.
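The memory scaling applied by the swan-highmem lane can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name highmem_request_gb is hypothetical, and the 24 GB preset is a made-up number rather than an actual CryoSPARC preset. It multiplies a preset value by the Highmem factor (default 4, integers from 1 to 10, as described above).

```shell
#!/bin/sh
# Hypothetical helper (not part of CryoSPARC or the OOD App):
# estimate the memory in GB that the swan-highmem lane would request.
#   $1 = preset memory in GB (illustrative value supplied by you)
#   $2 = Highmem factor, defaults to the documented default of 4
highmem_request_gb() {
    preset_gb=$1
    factor=${2:-4}
    # The form only accepts integer factors from 1 to 10.
    case $factor in
        ''|*[!0-9]*)
            echo "factor must be an integer" >&2
            return 1
            ;;
    esac
    if [ "$factor" -lt 1 ] || [ "$factor" -gt 10 ]; then
        echo "factor must be between 1 and 10" >&2
        return 1
    fi
    echo $((preset_gb * factor))
}

highmem_request_gb 24      # made-up 24 GB preset, default factor of 4
highmem_request_gb 24 6    # same preset with a Highmem factor of 6
```

Because a larger memory request generally means a longer wait in the queue, it is usually worth starting with the smallest factor that lets the job complete.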

CryoSPARC auxiliary programs

In addition to CryoSPARC, the CryoSPARC OOD App provides a few auxiliary programs, such as wrapper scripts for Topaz and deepEMhancer.
The executable paths to these scripts need to be set manually at the project level. Once this is done, the configured location will apply to all newly created jobs in that project.

  • The location of the wrapper script for Topaz is: /usr/local/bin/topaz.sh
  • The location of the wrapper script for deepEMhancer is: /usr/local/bin/deepemhancer.sh
  • The location of the deepEMhancer models is: /usr/local/share/deepemhancer_models/
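Before entering these paths in a project, it can help to confirm that they are visible from the session. The following is a minimal sketch, assuming it is run from a terminal inside the CryoSPARC OOD App desktop (the paths may not exist on other nodes). The helper name check_executable is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path exists and is executable.
check_executable() {
    if [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not executable: $1"
        return 1
    fi
}

# Wrapper script locations as listed above.
for script in /usr/local/bin/topaz.sh /usr/local/bin/deepemhancer.sh; do
    check_executable "$script" || true
done

# The deepEMhancer model directory as listed above.
if [ -d /usr/local/share/deepemhancer_models/ ]; then
    echo "ok: /usr/local/share/deepemhancer_models/"
else
    echo "missing: /usr/local/share/deepemhancer_models/"
fi
```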

CryoSPARC database

The default location of the CryoSPARC database is $WORK/cryosparc.

  • If you need to move this database elsewhere, please change the CryoSPARC session path using Select Path under the Show advanced settings… checkbox.
  • If you want to store the CryoSPARC session folder (database and configuration files) elsewhere, please change the CryoSPARC session path before you start the CryoSPARC session.

CryoSPARC example

If you want to test the CryoSPARC OOD App, you can use the CryoSPARC Introductory Tutorial.

Tips

  • The “master” process should be running while the CryoSPARC jobs submitted as part of the workflow are queued or running. The “master” process is terminated if:
    • the requested runtime set via the Running time in hours field has been reached
    • the Linux Desktop environment is exited by choosing Log Out from either the desktop menu in the upper left or the account name in the upper right
    • the interactive App has been terminated via the “Delete” button on OOD.
  • If you are not sure how long the jobs will be running for, please set Running time in hours to 168 hours (7 days) to avoid any issues.
  • In order for CryoSPARC to shut down gracefully, please end your session by choosing Log Out from the main desktop menu in the upper left corner within the CryoSPARC App when all jobs are no longer running or queued. You may then start the App again later and resume your progress.
  • The optimal number of GPUs needed for the submitted CryoSPARC SLURM jobs that utilize GPUs is 2 to 4 GPUs, so please adjust this value as part of the Project -> Job Builder accordingly.
  • You can only run one CryoSPARC instance at a time with the same license ID and database directory. If you need to run multiple CryoSPARC sessions at the same time, please request a different license and change the CryoSPARC session directory by selecting the Show advanced settings… checkbox. We do not generally recommend this approach however. Jobs submitted to SLURM will not complete faster by using multiple instances.
  • If you are unable to start a new CryoSPARC instance, there is a chance that your previously submitted CryoSPARC jobs didn’t complete successfully. Please make sure all your running CryoSPARC jobs are cancelled with scancel and, if necessary, delete any *.lock files in the CryoSPARC session directory that was used.
  • If you accidentally close the Firefox browser inside the CryoSPARC OOD App, the Login URL is printed on the CryoSPARC OOD Info Card.
  • Depending on the partition utilization, the submitted CryoSPARC jobs may wait in the queue for some time before they start running.
  • If you need to transfer data to/from Swan as part of your CryoSPARC workflow please see the Data Transfer page.
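The clean-up check from the tips above (leftover jobs and *.lock files) can be sketched as follows. This is a minimal sketch, assuming the default session directory $WORK/cryosparc. The function name list_lock_files is hypothetical, and the script only lists what it finds; cancelling jobs and deleting lock files is left as a manual step so that nothing is removed by accident.

```shell
#!/bin/sh
# Hypothetical helper: list *.lock files under a CryoSPARC session directory.
# It searches recursively and prints nothing if the directory does not exist.
list_lock_files() {
    [ -d "$1" ] || return 0
    find "$1" -type f -name '*.lock'
}

# List your queued or running jobs whose name mentions CryoSPARC
# (only attempted where the SLURM client tools are available).
if command -v squeue >/dev/null 2>&1; then
    squeue -h -u "${USER:-$(id -un)}" -o '%i %j %T' | grep -i cryosparc || true
fi

# List leftover lock files in the default session directory.
if [ -n "${WORK:-}" ]; then
    list_lock_files "$WORK/cryosparc"
fi

# After reviewing the output, cancel a leftover job manually with:
#   scancel <SLURM Job ID>
```

If you changed the CryoSPARC session path under the advanced settings, pass that directory to list_lock_files instead of $WORK/cryosparc.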

FAQ

  • How do I check the logs generated by my CryoSPARC job?
    • The logs associated with the CryoSPARC Open OnDemand App can be found in $WORK/.ondemand/batch_connect/sys/bc_hcc_cryosparc/swan/output/<session_id>/output.log, where <session_id> should be replaced with the Session ID printed in the CryoSPARC Info Card. The format of the Session ID is xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
    • The logs associated with your specific CryoSPARC project can be found in the job.log file in the Project directory.
  • Why am I getting “Found no NVIDIA driver on your system.” error?
    • The error means that your CryoSPARC job is not running on a GPU node, so please make sure you specify a GPU partition for your “master” process if you use the default lane.
  • How do I check the status of my completed CryoSPARC jobs?
    • You can monitor completed jobs using seff and sacct. For example, to find the SLURM Job ID, SLURM Job Name, SLURM Job State, the node the job ran on, the total runtime, as well as the requested and used memory of all your jobs that ran today, you can use: sacct --format=JobId,JobName%50,State,Node,Elapsed,MaxRSS,ReqMem
    • The SLURM Job Name of the CryoSPARC “master” process and the jobs that run on the default lane is always ondemand/sys/dashboard/sys/bc_hcc_cryosparc/swan.
    • The SLURM Job Name of the CryoSPARC jobs that run on the swan or swan-highmem lanes is always in the format cryosparc_<project_uid>_<job_uid>, where <project_uid> and <job_uid> are replaced with the CryoSPARC Project ID and CryoSPARC Job ID respectively. For example, if your CryoSPARC Project ID is 3, and your CryoSPARC Job ID is 187, the SLURM Job Name will be cryosparc_P3_J187.
  • I moved my CryoSPARC project directory to a different location on the cluster, and I cannot access my project anymore.
    • To move a CryoSPARC project directory from one location to another, please archive, move, and unarchive the project as explained in the CryoSPARC documentation.
    • If moving the CryoSPARC project directory as described above does not work, you can try deleting the cs.lock file in the project directory of an inoperable instance as explained in the CryoSPARC documentation.
  • My CryoSPARC job failed and I don’t know why.
    • If your CryoSPARC job fails, it is highly likely that the requested memory was exceeded. You can check this with seff or sacct.
    • If you use the default lane, please increase the value in the Requested RAM in GBs field from the CryoSPARC Open OnDemand Form.
    • If you use the swan lane, please try the swan-highmem lane instead and increase the Highmem factor value in the CryoSPARC Open OnDemand Form accordingly.
  • I tried everything, and my CryoSPARC job still fails, what can I do?
    • You can always email hcc-support@unl.edu with any additional questions or errors you have. To better assist you, please include the SLURM Job ID and the CryoSPARC Session ID of the erroneous job in your email.
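The naming conventions from the FAQ above can be combined into a short lookup script. This is a sketch only: the helper names cryosparc_job_name and ood_log_path are hypothetical, and the Project ID 3 and Job ID 187 are the example values used in the FAQ. The sacct call is only attempted where the SLURM client tools are installed.

```shell
#!/bin/sh
# Hypothetical helper: build the SLURM Job Name used by the swan and
# swan-highmem lanes from a numeric CryoSPARC Project ID and Job ID.
cryosparc_job_name() {
    printf 'cryosparc_P%s_J%s\n' "$1" "$2"
}

# Hypothetical helper: build the path of the Open OnDemand session log.
#   $1 = Session ID printed on the CryoSPARC Info Card
#   $2 = work directory, defaults to $WORK
ood_log_path() {
    printf '%s/.ondemand/batch_connect/sys/bc_hcc_cryosparc/swan/output/%s/output.log\n' \
        "${2:-${WORK:-}}" "$1"
}

# Example values from the FAQ: Project ID 3 and Job ID 187.
name=$(cryosparc_job_name 3 187)
echo "$name"

# Show today's accounting records for that job name only.
if command -v sacct >/dev/null 2>&1; then
    sacct --name="$name" \
          --format=JobId,JobName%50,State,Node,Elapsed,MaxRSS,ReqMem
fi
```

Once you know the SLURM Job ID from the sacct output, seff <SLURM Job ID> gives a summary of the job's CPU and memory efficiency, and ood_log_path <session_id> prints where to find the App log for that session.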

If you have any questions or encounter any issues with the CryoSPARC OOD App, please email hcc-support@unl.edu.