TL;DR: OpenCode is an open-source terminal coding agent. Ollama runs local LLMs on your machine. Together they give you a free, private Claude-Code-like experience with no API keys or monthly fees. This guide covers everything — installation, context window fixes, config, and the best models to use in 2026.

What is OpenCode — and Why It Matters

OpenCode is an open-source, terminal-based AI coding agent — think Claude Code or GitHub Copilot Chat, but fully provider-agnostic and free. It runs in your terminal as a TUI (terminal user interface) and can browse your codebase, edit files, run shell commands, and reason about multi-file architecture.

Unlike proprietary tools, OpenCode lets you connect to any LLM provider — Anthropic, OpenAI, Google, or a completely local model via Ollama. When you pair it with Ollama, your code never leaves your machine. No cloud, no telemetry, no usage limits.

100% Private

Your codebase stays local. No data sent to external servers when using Ollama as the provider.

Zero API Cost

Run as many sessions as you want. No tokens billed, no monthly subscription for the model.

Plan + Build Modes

Press Tab to toggle between Plan mode (see the diff before changes) and Build mode (execute directly).

Multi-Model Support

Define multiple Ollama models in config. Switch with /models during any session.


Prerequisites & System Requirements

Before starting, make sure your machine meets the minimum requirements. Running local LLMs is GPU-bound — the more VRAM or unified memory you have, the larger and better the model you can run.

| Requirement | Minimum | Recommended | Notes |
|---|---|---|---|
| Node.js | v18+ | v20+ | OpenCode is a JS app published on npm |
| RAM / VRAM | 8 GB | 16–24 GB | Apple Silicon uses unified memory (great for Ollama) |
| Ollama | v0.15+ | Latest | Required for the ollama launch command |
| OS | macOS / Linux | macOS (Apple Silicon) | Windows via WSL2 also works |
| Context window | 16k tokens | 32k–64k tokens | OpenCode needs at least 16k for tool use to work |
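
Before installing anything, you can sanity-check the requirements that are visible from the shell. A minimal sketch (the memory commands differ by OS):

bash — Quick prerequisite check
# Confirm Node.js is v18+ (needed for the npm install in Step 3)
node --version

# Check total memory
# macOS (unified memory, in bytes):
sysctl -n hw.memsize
# Linux:
free -h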

Step 1 — Install Ollama and Pull a Model

Ollama is a local LLM runtime that manages model downloads and exposes an OpenAI-compatible API on localhost:11434. OpenCode communicates with it via the /v1 endpoint.

bash — Install Ollama
# macOS (Homebrew)
brew install ollama
 
# Linux (one-liner)
curl -fsSL https://ollama.com/install.sh | sh
 
# Verify install
ollama --version
 
# Start the server (keep this running in a terminal tab)
ollama serve

Now pull a coding-optimised model. For most machines, qwen2.5-coder:7b is the best starting point. If you have 16GB+ VRAM or unified memory, go for the larger qwen3-coder:30b.

bash — Pull models
# Best balance of quality + speed (7B params, ~5GB)
ollama pull qwen2.5-coder:7b
 
# Larger, more capable (requires 16GB+ VRAM)
ollama pull qwen3-coder:30b
 
# Fastest option for low-spec machines
ollama pull deepseek-coder:6.7b
 
# Verify models are ready
ollama list
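
Since OpenCode talks to Ollama through the OpenAI-compatible /v1 endpoint, you can smoke-test that path before writing any config. A minimal request, assuming the server from the previous block is still running:

bash — Smoke-test the /v1 endpoint
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-coder:7b",
    "messages": [{"role": "user", "content": "Write hello world in Python."}]
  }'

A JSON response containing a choices array confirms the exact code path OpenCode will use.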

// Info — ollama launch shortcut

With Ollama v0.15+, you can skip manual config entirely: ollama launch opencode --model qwen3-coder auto-configures everything. This guide covers the manual path so you understand what's happening under the hood.


Step 2 — Fix the Context Window (Critical)

This is the step most tutorials skip — and the reason tool calls silently fail. Ollama defaults all models to a 4,096-token context window, even when the model supports 128k+. OpenCode needs at least 16k tokens to use agentic tools like file editing and bash execution.

⚠ Warning — Don't Skip This Step

If your context window stays at 4k, OpenCode will start but file operations and commands won't execute. This is the #1 reported setup issue. Always create a custom model variant with an extended context before configuring OpenCode.

bash — Create a 16k context model variant
# Open the model in an interactive session
ollama run qwen2.5-coder:7b
 
# Inside the Ollama prompt, set context length
>>> /set parameter num_ctx 16384
 
# Save as a new named variant
>>> /save qwen2.5-coder:7b-16k
 
# Exit
>>> /bye
 
# For larger RAM machines — 32k context (better results)
ollama run qwen2.5-coder:7b
>>> /set parameter num_ctx 32768
>>> /save qwen2.5-coder:7b-32k
>>> /bye
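
If you'd rather script this than drive the interactive prompt, the same variant can be built from a Modelfile. A sketch of the equivalent non-interactive path:

bash — Create the variant from a Modelfile
# Minimal Modelfile extending the base model
cat > Modelfile <<'EOF'
FROM qwen2.5-coder:7b
PARAMETER num_ctx 16384
EOF

# Build the named variant
ollama create qwen2.5-coder:7b-16k -f Modelfile

# Confirm num_ctx shows up under Parameters
ollama show qwen2.5-coder:7b-16k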

Alternatively, if you run Ollama as a systemd service, you can set the context length globally via an environment variable:

ini — systemd environment variable
# /etc/systemd/system/ollama.service (add to [Service] block)
Environment="OLLAMA_CONTEXT_LENGTH=32000"

Step 3 — Install OpenCode

OpenCode is distributed as an npm package. Install it globally so you can run opencode from any directory.

bash — Install OpenCode globally
# Install via npm (Node.js 18+ required)
npm install -g opencode-ai
 
# Verify install
opencode --version
 
# If the npm global bin directory isn't in PATH, add it:
export PATH="$PATH:$(npm config get prefix)/bin"
 
# Launch OpenCode in your project directory
cd ~/your-project
opencode

Once the TUI is open, run /init: this generates an AGENTS.md file that maps your project structure, and is how OpenCode understands your codebase before you write a single prompt.


Step 4 — Configure opencode.json for Ollama

OpenCode reads provider configuration from ~/.config/opencode/opencode.json. Create or edit this file to point at your local Ollama server. The key detail: OpenCode uses the /v1 OpenAI-compatible endpoint — not Ollama's native API.

json — ~/.config/opencode/opencode.json (single model)
{
  "$schema": "https://opencode.ai/config.json",
  "model": "ollama/qwen2.5-coder:7b-16k",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (Local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b-16k": {
          "name": "Qwen2.5 Coder 7B (16k)",
          "tools": true
        }
      }
    }
  }
}

Want to configure multiple models and switch between them? Here's a multi-model config. Note that each key under models must match a variant that actually exists in ollama list, so save the 32k and 16k variants first using the Step 2 procedure:

json — Multiple models config
{
  "$schema": "https://opencode.ai/config.json",
  "model": "ollama/qwen2.5-coder:7b-16k",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (Local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b-16k": {
          "name": "Qwen2.5 Coder 7B — Fast",
          "tools": true
        },
        "qwen3-coder:30b-32k": {
          "name": "Qwen3 Coder 30B — Powerful",
          "tools": true
        },
        "deepseek-coder:6.7b-16k": {
          "name": "DeepSeek Coder 6.7B — Lightweight",
          "tools": true
        }
      }
    }
  }
}

// Tip — Verify Ollama is reachable

Before launching OpenCode, confirm Ollama's /v1 endpoint responds: curl http://localhost:11434/v1/models — you should see a JSON list of your pulled models.


Step 5 — Run OpenCode and Start Coding

With Ollama running and the config in place, launch OpenCode from inside your project directory:

bash — Launch OpenCode
# Make sure Ollama is still running in another terminal
ollama serve
 
# Navigate to your project, then start OpenCode
cd ~/your-project
opencode
 
# Or run a one-shot task non-interactively
opencode run "add TypeScript types to all functions in src/api.js" \
  --model ollama/qwen2.5-coder:7b-16k

Once inside the TUI, here are the key commands to know:

| Shortcut / Command | Action |
|---|---|
| Tab | Toggle between Plan mode and Build mode |
| Ctrl + P | Open the command palette (rename session, switch models, etc.) |
| /models | Switch to a different configured model mid-session |
| /connect | Add a new provider (cloud or local) |
| Esc | Exit the command palette or cancel the current action |
| /no_think | Add to a prompt to skip chain-of-thought (faster) |

Best Ollama Models for OpenCode in 2026

Not all Ollama models support agentic tool use. Only models specifically trained for function calling and tool execution work correctly with OpenCode's file and bash operations. Here are the current best options, ranked by use case:

| Model | Size | RAM required | Tool use | Best for |
|---|---|---|---|---|
| qwen2.5-coder:7b | 7B | 8 GB | ✓ Full | Best starter: fast, capable, wide compatibility |
| qwen3-coder:30b | 30B MoE | 16 GB | ✓ Full | Best quality/speed for 16 GB+ machines |
| qwen3:8b | 8B | 8 GB | ✓ Full | General reasoning + code, good for multi-turn |
| deepseek-coder:6.7b | 6.7B | 6 GB | Partial | Low-RAM fallback, analysis tasks |
| codellama:7b | 7B | 8 GB | Limited | Legacy option; use qwen2.5-coder instead |
| glm-4.7:cloud | Cloud | None (API) | ✓ Full | Ollama's recommended cloud model for OpenCode |

⚠ Model Compatibility Note

Models like Mistral Nemo and Granite can analyse code in OpenCode but cannot create or modify files, because they lack proper tool-calling training. Stick to the Qwen coder models (qwen2.5-coder, qwen3-coder) for full agentic capabilities.
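
Recent Ollama builds report tool support directly, so you can check a model before wiring it into OpenCode (assumes a reasonably current Ollama; older versions omit this section):

bash — Check tool-calling capability
ollama show qwen2.5-coder:7b
# Look for "tools" under Capabilities in the output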


Troubleshooting Common Issues

| Problem | Likely cause | Fix |
|---|---|---|
| Tool calls silently fail / files not created | Context window too small (4k default) | Re-save the model with num_ctx 16384 as shown in Step 2 |
| connection refused on startup | Ollama server not running | Run ollama serve in a separate terminal tab first |
| Model not found in OpenCode | Wrong model name in the config JSON | Run ollama list and copy the exact name into opencode.json |
| Very slow responses | Model too large for available memory | Switch to a smaller model variant (7B instead of 30B) |
| OpenCode doesn't see the Ollama provider in /connect | Ollama isn't listed in the connect UI by default | Define the provider manually in opencode.json as in Step 4 |
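
Most of these can be triaged with two commands before digging deeper (assumes the default port 11434):

bash — Quick triage
# Is the server reachable on the endpoint OpenCode uses?
curl -s http://localhost:11434/v1/models

# Do the names here match opencode.json exactly?
ollama list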


Frequently Asked Questions

Does OpenCode work with Ollama without an API key?

Yes. When using a local Ollama instance, no API key is required. OpenCode connects directly to http://localhost:11434/v1. If the config prompts for a key, any placeholder string will work.

Which is the best model to use with OpenCode locally in 2026?

For most machines, qwen2.5-coder:7b-16k is the best starting point — it has strong tool-calling support and runs well on 8GB RAM. If you have 16GB+ unified memory (e.g. M2/M3 Mac), upgrade to qwen3-coder:30b for significantly better results.

Why aren't file operations working in OpenCode with Ollama?

The most common cause is the context window being stuck at Ollama's 4k default. You must manually save a model variant with num_ctx set to at least 16,384. Follow Step 2 in this guide exactly.

Can I use OpenCode with both local Ollama models and cloud models?

Yes — OpenCode supports multiple providers simultaneously. You can define Ollama local models alongside providers like Anthropic or OpenAI in the same opencode.json and switch between them with /models during a session.

Is OpenCode + Ollama a good free alternative to Claude Code?

For personal projects, scripts, and homelab work — yes, it's excellent and completely free. For professional production codebases requiring complex multi-file reasoning at scale, cloud models like Claude Sonnet 4.5 via Claude Code or Cursor still outperform local 7–30B models. The gap is narrowing fast though.
