Connector | Agentic capability | Free
Ollama
Run large language models locally with a single command. OpenAI-compatible API for Llama, Mistral, Gemma, and more.
Ollama makes running open-weight LLMs locally as simple as typing ollama run llama3. It ships a cross-platform daemon with an OpenAI-compatible REST API, so any agent or tool that speaks the OpenAI API can point its base URL at http://localhost:11434/v1 and run on-device models with no cloud dependency.
Key Features
- One-command setup on Linux/macOS: curl -fsSL https://ollama.com/install.sh | sh
- Model library: 100+ curated models, including Llama 3, Mistral, Gemma 3, Phi-4, Qwen, DeepSeek, and multimodal models
- OpenAI-compatible API: drop-in replacement for openai SDK clients by changing one base URL
- GPU acceleration: NVIDIA CUDA, AMD ROCm, and Apple Metal, detected automatically
- Modelfile: set parameters, layer adapters, and customize system prompts in a Dockerfile-like format
- Streaming: token-by-token streaming responses
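To make the Modelfile feature more concrete, here is a minimal sketch that assembles a Modelfile as text. The helper name render_modelfile, the system prompt, and the parameter values are illustrative assumptions and not part of Ollama; only the FROM, ADAPTER, PARAMETER, and SYSTEM instructions are Modelfile syntax.

```python
def render_modelfile(base_model, system_prompt, parameters=None, adapter_path=None):
    """Build Modelfile text. Hypothetical helper for illustration, not an Ollama API."""
    lines = [f"FROM {base_model}"]  # base model to build on
    if adapter_path:
        lines.append(f"ADAPTER {adapter_path}")  # optional adapter layered on top
    for name, value in (parameters or {}).items():
        lines.append(f"PARAMETER {name} {value}")  # runtime parameters
    lines.append(f'SYSTEM """{system_prompt}"""')  # custom system prompt
    return "\n".join(lines) + "\n"


# Example values are assumptions chosen for the sketch.
modelfile_text = render_modelfile(
    "llama3.2",
    "You are a concise assistant.",
    parameters={"temperature": 0.7, "num_ctx": 4096},
)
print(modelfile_text)
```

Saving the printed text to a file named Modelfile and running ollama create concise-llama -f Modelfile should register a customized model (concise-llama is a made-up name), which can then be started with ollama run concise-llama.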
Quick Start
# Pull the model on first use, then start an interactive chat
ollama run llama3.2
# Use via OpenAI-compatible API
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello!"}]}'
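The same request can be sketched in Python using only the standard library, including the streaming variant. This assumes the daemon is listening on the default address used in the curl example and that streamed responses follow the OpenAI-style server-sent-event format (lines beginning with data:, ending with data: [DONE]). The function names are illustrative, not part of any SDK.

```python
import json
import urllib.request

# Default local address, taken from the curl example above.
BASE_URL = "http://localhost:11434/v1"


def build_chat_request(model, user_message, stream=False):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": stream,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_delta(sse_line):
    """Return the text fragment in one streamed 'data:' line, or None if there is none."""
    line = sse_line.strip()
    if not line.startswith("data:"):
        return None
    body = line[len("data:"):].strip()
    if body == "[DONE]":  # end-of-stream marker
        return None
    choices = json.loads(body).get("choices") or []
    if not choices:
        return None
    return (choices[0].get("delta") or {}).get("content")


def chat(model, user_message):
    """Send a non-streaming request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(model, user_message)) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]


def stream_chat(model, user_message):
    """Yield reply fragments as they arrive from a streaming request."""
    req = build_chat_request(model, user_message, stream=True)
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # the response object iterates line by line
            text = extract_delta(raw.decode("utf-8"))
            if text:
                yield text
```

With the daemon running and llama3.2 already pulled, print(chat("llama3.2", "Hello!")) would print the full reply, while looping over stream_chat("llama3.2", "Hello!") prints it piece by piece. Neither function is called at import time, so the block can be loaded without a server present.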
Install via ai-supply
npx ai-supply add ollama-local-model-runtime
Curated mirror of the open-source Ollama project (MIT). Install upstream from the repository.