How can I quickly run an LLM on the cluster?
CPU
- allocate a full node for 4 hours (adjust as you see fit)
srun -p public -n 64 -N 1 --mem=0 -t 4:00:00 --pty /bin/bash
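Once the shell starts you can sanity-check what Slurm actually gave you (--mem=0 requests all of the node's memory); these are standard Linux commands, nothing cluster-specific:
- confirm the core count and memory of the allocation
nproc
free -h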
GPU
- allocate a GPU for 4 hours (adjust as you see fit)
srun -p gpu-a100 -n 32 -N 1 --mem=128G -t 4:00:00 --pty /bin/bash
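Before loading anything, nvidia-smi (part of the standard NVIDIA driver stack on GPU nodes) is a quick check that the GPU is actually visible from your shell:
- confirm the GPU allocation
nvidia-smi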
- module load apptainer
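If the module loaded cleanly, the apptainer command is now on your PATH:
- verify the module is loaded
apptainer --version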
- get the container (only needed the first time)
apptainer pull docker://ollama/ollama
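The pull writes ollama_latest.sif into the current directory. If home-directory quota is a concern, Apptainer honors the APPTAINER_CACHEDIR variable for its layer cache; the path below is a placeholder, not a site-specific one:
- optional: move the build cache off your home quota before pulling
export APPTAINER_CACHEDIR=/PATH/TO/SCRATCH/apptainer-cache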
- run the server in the background (--nv exposes the node's NVIDIA GPU inside the container)
apptainer run --nv ollama_latest.sif &
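Give the server a few seconds to start; ollama listens on localhost port 11434 by default, so a quick curl of its model-list endpoint confirms it is up:
- wait briefly, then check that the server responds
sleep 5
curl http://localhost:11434/api/tags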
- run the model interactively (there is a list of models you can pick from at https://github.com/ollama/ollama)
apptainer run ollama_latest.sif run llama3.1:8b "$(cat /PATH/TO/YOUR/FILE.csv) please summarize this data"
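For scripting rather than an interactive session, the same background server exposes ollama's REST API; a minimal sketch reusing the model above ("stream": false returns a single JSON object instead of a token stream):
- send one prompt to the running server via the API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "please summarize this data",
  "stream": false
}'
- when you are done, stop the background server (assuming it is the only background job in this shell)
kill %1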