Gangmax Blog

Make "OpenCode" Work With Local Model

The following instructions show:

  1. How to run a local model with “llamafile”.

  2. Configure “opencode“(the local AI agent) to use the local model.

From “ChatGPT“ and the official documents.

Run Local Model

  1. Download “llamafile” from here, in my case it’s “llamafile-0.10.3”.

  2. Download a model from “huggingface“, such as “Qwen3-Coder-30B-A3B-Instruct-GGUF“. In my case “Qwen3-Coder-30B-A3B-Instruct-UD-Q6_K_XL.gguf” is used.

  3. Run the local model with the following command:

1
2
3
# More details can be found at "https://docs.mozilla.ai/llamafile/getting-started/quickstart".
# This will start an HTTP service listening at port 8080 by default.
./llamafile-0.10.3 -m Qwen3-Coder-30B-A3B-Instruct-UD-Q6_K_XL.gguf --server --jinja

Configure “opencode” to Use The Local Model

Install “opencode“ if you didn’t yet. In my case I used “npm i -g opencode-ai” to install.

Update the “~/.config/opencode/opencode.json” file with the following content, which tells “opencode” how to use the local running model you started:

~/.config/opencode/opencode.json
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"qwen-local": {
"npm": "@ai-sdk/openai-compatible",
"name": "Qwen Local",
"options": {
"baseURL": "http://127.0.0.1:8080/v1"
},
"models": {
"qwen3-coder": {
"name": "Qwen3-Coder 30B",
"reasoning": true,
"limit": {
"context": 128000,
"output": 16384
}
}
}
}
},
"model": "qwen-local/qwen3-coder"
}

Note that there’ no API key is required for the local running model. Now you can start “opencode”. If everything works, you should be able to use the local model in it.

Comments