Lehmus AI: Available LLM Models

The following language models are currently available on the Lehmus AI platform. The list below shows the typical use cases of each model as well as the most important supported features.

Gemma 4 31B

Use case: general-purpose, analysis, reasoning, coding

google/gemma-4-31b-it
model id: jwwqblcgkizlhxjjbkcp
enable-auto-tool-choice
reasoning-parser=gemma4
tool-call-parser=gemma4
max-model-len=131072

Gemma 4 26B MoE

Use case: long materials, summarization, analysis, coding

google/gemma-4-26B-A4B-it
model id: wnofibsjlomanbprgbdp
enable-auto-tool-choice
reasoning-parser=gemma4
tool-call-parser=gemma4
max-model-len=131072

GPT OSS 120B

Use case: analysis, problem-solving, agents, tools

openai/gpt-oss-120b
model id: azsydsttjnlbfjbgqnwd
enable-auto-tool-choice
reasoning-parser=openai_gptoss
tool-call-parser=openai
kv-cache-dtype=fp8

Qwen3.6-35B-A3B

Use case: coding, agents, reasoning, tool calls

qwen/qwen3.6-35B-A3B
model id: ctraxdkjuwfrlivzrhgt
enable-auto-tool-choice
reasoning-parser=qwen3
tool-call-parser=qwen3_xml
max-model-len=131072
max-num-batched-tokens=4096
attention-backend=FLASHINFER

Qwen3.6-27B

Use case: general-purpose, conversation, analysis, coding

qwen/qwen3.6-27B
model id: hggelmtxxpzxwqucbjha
enable-auto-tool-choice
reasoning-parser=qwen3
tool-call-parser=qwen3_xml
max-model-len=262144
speculative-config.num_speculative_tokens=2
speculative-config.method=mtp
max-num-batched-tokens=4096
attention-backend=FLASHINFER

Qwen2.5-72B-Instruct-AWQ

Käyttötarkoitus: koodaus, agentit, päättely, työkalukutsut

qwen-qwen2-5-72b-instruct-awq
model id: Qwen/Qwen2.5-72B-Instruct-AWQ
enable-auto-tool-choice
tool-call-parser=hermes
max-model-len=32768
enforce-eager
max-num-batched-tokens=8192
dtype=float16
quantization=awq

DeepSeek R1 Distill Llama 70B FP8 dynamic

Käyttötarkoitus: koodaus, agentit, päättely, työkalukutsut

neuralmagic-deepseek-r1-distill-llama-70b-fp8-dynamic
model id: neuralmagic/DeepSeek-R1-Distill-Llama-70B-FP8-dynamic
enable-auto-tool-choice
reasoning-parser=deepseek_r1
tool-call-parser=hermes
max-model-len=32768
max-num-seqs=2
max-num-batched-tokens=8192
trust-remote-code
tokenizer-mode=auto
tensor-parallel-size=1
dtype=auto
tokenizer=neuralmagic/DeepSeek-R1-Distill-Llama-70B-FP8-dynamic

ICT Services aim to keep the model catalog as diverse as possible and to add new models whenever feasible. However, capacity is limited, so not all models can be offered. Instead, the selection is developed to support as many use cases as possible.

Is the language model you want not available?

Please first check whether you could use one of the existing language models for your purpose. You can request a new language model for the platform using the following form: requesting a new language model. After you submit a request for a new language model, our team will evaluate it based on the following criteria:

Hardware compatibility: Assessment of VRAM requirements in relation to the available resources
Data protection: We ensure that the model can be run safely in our environment without known side effects

« Back

This article was published in categories English version available, All instructions, for the University of Oulu staff, UniOulu and tags Confidential Minds, ConfidentialMinds, Kielimallit, Lehmus AI, LLM. Add the permalink to your favourites.