Developer Documentation

Installation

# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Verify
ollama --version

Key CLI Commands

ollama pull llama3          # Download a model
ollama run llama3           # Run interactive session
ollama list                 # List installed models
ollama serve                # Start the local API server (default: port 11434)
ollama launch openclaw      # Launch a compatible app

REST API (OpenAI-compatible)

Ollama serves an HTTP API at http://localhost:11434. Its /v1/chat/completions endpoint accepts requests in the OpenAI Chat Completions format:

curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Key Parameters

Parameter	Description
`model`	Model name (e.g., `llama3`, `mistral`, `gemma`)
`messages`	Chat history array
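`stream`	Boolean; enables token-by-token streaming of the response
`options`	Model options such as `temperature`, `num_ctx`, and `top_p`; this object is read by the native `/api/chat` endpoint, while `/v1/chat/completions` takes `temperature` and `top_p` as top-level fields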
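
As a rough illustration of these parameters, the shell sketch below assembles a request body and posts it to Ollama's native `/api/chat` endpoint, which is where the `options` object is accepted. The function names `chat_payload` and `ollama_chat` are hypothetical helpers, not part of Ollama. The default values for `temperature` and `num_ctx` are arbitrary, and the payload builder assumes the model name and prompt contain no double quotes or backslashes, since it does no JSON escaping.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Ollama) for a native /api/chat request.
# Assumption: MODEL and PROMPT contain no double quotes or backslashes,
# because this sketch does no JSON escaping.

# Print a JSON request body.
# Usage: chat_payload MODEL PROMPT [TEMPERATURE] [NUM_CTX]
# The fallback values 0.7 and 4096 are arbitrary sketch defaults.
chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"stream":false,"options":{"temperature":%s,"num_ctx":%s}}\n' \
    "$1" "$2" "${3:-0.7}" "${4:-4096}"
}

# POST the body to the local server; `ollama serve` must be running.
# curl reads the request body from stdin via `-d @-`.
ollama_chat() {
  chat_payload "$@" | curl -fsS http://localhost:11434/api/chat \
    -H "Content-Type: application/json" \
    -d @-
}
```

With a server running, `ollama_chat llama3 'Hello!' 0.2 2048` would send the request. Because the body sets `"stream": false`, the reply arrives as a single JSON object rather than as a sequence of chunks.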

Hardware Acceleration

Apple Silicon: Metal (automatic)
NVIDIA: CUDA (automatic when NVIDIA drivers are installed)
AMD: ROCm (on supported GPUs)
Fallback: CPU inference when no supported GPU is detected
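
To get a rough idea of which of these backends a machine is likely to use, a shell heuristic along the following lines may help. This is only a sketch: `guess_backend` is a hypothetical helper, it merely checks the platform and looks for the vendor tools `nvidia-smi` and `rocminfo` on the PATH, and it does not reflect how Ollama itself detects hardware.

```shell
#!/bin/sh
# Hypothetical heuristic (not Ollama's own detection logic): guess the
# likely acceleration backend from the platform and installed vendor tools.
guess_backend() {
  if [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]; then
    echo "metal"   # Apple Silicon Mac
  elif command -v nvidia-smi >/dev/null 2>&1; then
    echo "cuda"    # NVIDIA driver tooling found on PATH
  elif command -v rocminfo >/dev/null 2>&1; then
    echo "rocm"    # AMD ROCm tooling found on PATH
  else
    echo "cpu"     # nothing GPU-related found
  fi
}
```

Calling `guess_backend` prints one of `metal`, `cuda`, `rocm`, or `cpu`. The result is only a guess; once a model is loaded, `ollama ps` should indicate whether it is actually running on the GPU or the CPU.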