LLM Access

Research and Computing Services (RCS) has set up a Large Language Model (LLM) server for the Carleton University community. Access to its REST API endpoints is granted upon request. Many models are deployed on this server, including:

all-minilm:33m
all-minilm:latest
command-r-plus:latest
command-r:latest
embeddinggemma:latest
gemma4:26b
gemma4:latest
gpt-oss:120b
gpt-oss:20b
llama3.1:latest
llama3.2-vision:90b
llama3.2:latest
llama3.3:latest
llama4:scout
medgemma1.5:4b
medgemma:27b
medgemma:4b
mistral-large:latest
mixtral:8x22b
mxbai-embed-large:latest
nemotron3:33b
qwen2.5:latest
qwen3-coder-next:q8_0
qwen3-coder:latest
qwen3-embedding:latest
qwen3.5:27b
qwen3.5:35b
qwen3.5:9b
qwen3.5:latest
qwen3.6:35b
qwen3:235b
qwen3:32b
qwen3:8b

The backend uses vLLM and Ollama to deploy models. You can follow the instructions in the Ollama docs, but prefer the OpenAI-compatible endpoints, since every backend server supports them. Authenticate each request with your RCS API key, sent either in an “x-api-key” header or as a Bearer token in the Authorization header. The API key can be requested here. If your request is approved, you will receive the host and the API key. A complete list of currently deployed models can be found here (Carleton VPN connection required).
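
As a minimal sketch of the two authentication options described above, the snippet below builds the request headers for either style. The helper name `build_auth_headers` and the placeholder key are illustrative and not part of any RCS library; only the header names come from the text above.

```python
def build_auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Return HTTP headers carrying the RCS API key (hypothetical helper)."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    if style == "bearer":
        # Standard Authorization header with a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":
        # Alternative custom header mentioned in the text.
        return {"x-api-key": api_key}
    raise ValueError(f"unknown style: {style!r}")


if __name__ == "__main__":
    # "YOUR_RCS_API_KEY" is a placeholder, not a real key.
    print(build_auth_headers("YOUR_RCS_API_KEY"))
    print(build_auth_headers("YOUR_RCS_API_KEY", style="x-api-key"))
```

Either dictionary can be passed to `requests.Session().headers.update(...)` so that every request made through that session carries the key.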

The code below is an example of access using the requests package in Python against the OpenAI-compatible endpoints:

import json
import requests
from argparse import ArgumentParser, BooleanOptionalAction

DEFAULT_HOST = "https://rcsllm.carleton.ca/rcsapi"
parser = ArgumentParser(description="RCS API Client Example")
parser.add_argument("--host", type=str, help="RCS api endpoint", required=False, default=DEFAULT_HOST)
parser.add_argument("--model", type=str, help="Model to use for requests", required=False, default="gpt-oss:120b")
parser.add_argument("--prompt", type=str, help="Prompt to send to the model", required=False, default="What is an LLM and how are you compared to other models?")
parser.add_argument("--stream", action=BooleanOptionalAction, help="Stream the response", required=False, default=True)
parser.add_argument("--api_key", type=str, help="API Key for authentication", required=True)
parser.add_argument("--list_models", action="store_true", help="List available models", required=False, default=False)

if __name__ == "__main__":
    args = parser.parse_args()
    api_key = args.api_key
    client = requests.Session()
    client.headers.update({"Authorization": f"Bearer {api_key}"})
    
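    if args.list_models:
        resp = client.get(f"{args.host}/v1/models")
        resp.raise_for_status()
        models = resp.json()
        print("Available models:")
        # Assumption: OpenAI-style listings use "data"/"id", Ollama-style use "models"/"model".
        for model in models.get("data") or models.get("models", []):
            print(f"- {model.get('id') or model.get('model')}")
        raise SystemExit(0)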
    stream = args.stream
    
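    # Passing stream=stream to requests keeps it from buffering the whole body,
    # so chunks can be printed as they arrive.
    response = client.post(f"{args.host}/v1/chat/completions", json={
        "model": args.model,
        "messages": [
            {"role": "user", "content": args.prompt}
        ],
        "stream": stream
    }, stream=stream)
    response.raise_for_status()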
    
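    if stream:
        for chunk in response.iter_lines():
            line = chunk.decode("utf-8").strip()
            # Server-sent events carry their payload on lines starting with "data:".
            if not line.startswith("data:"):
                continue
            data = line[len("data:"):].strip()
            if data == "[DONE]":
                break
            choices = json.loads(data).get("choices") or []
            if not choices:
                continue
            delta = choices[0].get("delta", {})
            # Some models emit a "reasoning" field before the final "content".
            if delta.get("reasoning"):
                print(delta["reasoning"], end="", flush=True)
            if delta.get("content"):
                print(delta["content"], end="", flush=True)
        print()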
    else:
        print(response.json()['choices'][0]['message'])
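
The streaming branch reads server-sent events, where each payload line starts with `data:` and the stream ends with `[DONE]`. The sketch below isolates that parsing step in a small function so it can be checked offline without an API key. The name `extract_delta_text` and the sample lines are illustrative, and the chunk layout is assumed to follow the OpenAI-style chat completion format.

```python
import json
from typing import Optional


def extract_delta_text(raw_line: bytes) -> Optional[str]:
    """Return the text carried by one streamed line, or None if it has none.

    Hypothetical helper: assumes OpenAI-style chat completion chunks.
    """
    line = raw_line.decode("utf-8").strip()
    if not line.startswith("data:"):
        return None  # blank keep-alive lines or other SSE fields
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None  # end-of-stream marker
    choices = json.loads(payload).get("choices") or []
    if not choices:
        return None
    delta = choices[0].get("delta", {})
    # Prefer visible content; fall back to reasoning text when present.
    return delta.get("content") or delta.get("reasoning") or None


if __name__ == "__main__":
    # Made-up sample lines in the shape the streaming endpoint is assumed to send.
    sample = [
        b'data: {"choices": [{"delta": {"role": "assistant"}}]}',
        b'data: {"choices": [{"delta": {"content": "Hello"}}]}',
        b"",
        b'data: {"choices": [{"delta": {"content": ", world"}}]}',
        b"data: [DONE]",
    ]
    print("".join(t for t in map(extract_delta_text, sample) if t))
```

Slicing after an explicit `startswith` check avoids `str.strip('data: ')`, which strips a set of characters from both ends of the string rather than a literal prefix.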

If you wish to get access to this resource, fill out the form available here.