
Using LLMs on HCC resources

Large language models (LLMs) are models pre-trained on vast amounts of data. LLMs can be used in several ways on HCC resources.

Open OnDemand Apps

  • LM Studio is available as an Open OnDemand App. While LM Studio can utilize both CPU and GPU nodes, requesting a GPU is recommended so that the models run faster.

Note

When using LM Studio, please make sure that the CPU Thread Pool Size value matches the number of requested cores, the GPU Offload checkbox is selected, and the slider is set to Max. These settings can be found in the Settings tab of the LM Studio GUI.

System-wide modules

  • We currently provide Ollama as a system-wide module on Swan. This module can be loaded with:
    module purge
    module load ollama/0.11
    
    Examples of Ollama SLURM submit scripts can be found here; a minimal sketch is also shown below.
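
    The following is only a sketch of a possible submit script: the resource requests (cores, memory, time, partition, GPU type) and the model name are assumptions and should be adjusted to your workload.

    #!/bin/bash
    #SBATCH --job-name=ollama_llm
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --cpus-per-task=8
    #SBATCH --mem=32gb
    #SBATCH --time=02:00:00
    #SBATCH --partition=gpu
    #SBATCH --gres=gpu:1

    module purge
    module load ollama/0.11

    # Start the Ollama server in the background and give it time to initialize
    ollama serve &
    sleep 10

    # Download a model and run a single prompt (llama3 is only an example)
    ollama pull llama3
    ollama run llama3 "Briefly explain what a large language model is."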

Downloading models

You can download various models with both LM Studio and Ollama. The available models are listed on their respective websites.

Note

By default, the download location for Ollama models is $HOME/.ollama. This location can be changed by setting the OLLAMA_MODELS environment variable. For example, to use $NRDSTOR for the Ollama models, please use:

export OLLAMA_MODELS=$NRDSTOR/OLLAMA_MODELS
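
As a minimal sketch, assuming the Ollama module is loaded and the server is running, models pulled after exporting the variable are stored under the new location (the directory and model name are only examples):

# Create the target directory if it does not exist yet
mkdir -p $NRDSTOR/OLLAMA_MODELS
export OLLAMA_MODELS=$NRDSTOR/OLLAMA_MODELS

# With the Ollama server running (e.g., started via "ollama serve &"),
# the pulled model files are written under $OLLAMA_MODELS
ollama pull llama3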

Note

By default, the download location for LM Studio models is $HOME/.lm_studio. This location can be changed within the LM Studio OOD App by navigating to the My Models tab and choosing a new directory with Models Directory.

Please note that the $WORK file system has a purge policy, and neither $WORK nor $NRDSTOR is backed up.