Name: Ollama
Availability: InStock
Author: ai-supply

Ollama

Ollama makes running open-weight LLMs locally as simple as ollama run llama3. It ships a cross-platform daemon with an OpenAI-compatible REST API, so an agent or tool that already talks to the OpenAI API can point at http://localhost:11434/v1 instead and run on-device models with no cloud dependency.

Key Features

One-command setup: curl -fsSL https://ollama.com/install.sh | sh on Linux; macOS and Windows use the installers from ollama.com
Model library: 100+ curated models, including Llama 3, Mistral, Gemma 3, Phi-4, Qwen, DeepSeek, and multimodal models
OpenAI-compatible API: works with openai SDK clients by changing the base URL to the local server
GPU acceleration: automatic detection of NVIDIA CUDA, AMD ROCm, and Apple Metal
Modelfile: set parameters, layer adapters, and customize system prompts in a Dockerfile-like format
Streaming: token-by-token streaming responses
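
To make the Modelfile feature concrete, here is a minimal sketch. It assumes the llama3.2 base model used in the Quick Start; the model name concise-assistant, the temperature value, and the system prompt are hypothetical choices for illustration, not part of this listing.

```shell
# Write a minimal Modelfile (all values below are illustrative assumptions)
cat > Modelfile <<'EOF'
FROM llama3.2
PARAMETER temperature 0.7
SYSTEM """You are a concise assistant. Answer in one short paragraph."""
EOF

# Build a named model from the Modelfile, only if the ollama CLI is present.
# "concise-assistant" is a hypothetical name chosen for this sketch.
if command -v ollama >/dev/null 2>&1; then
  ollama create concise-assistant -f Modelfile \
    || echo "ollama create failed; is the daemon running?" >&2
fi
```

If the build succeeds, the customized model should be runnable with ollama run concise-assistant, the same way as a library model.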

Quick Start

# Download the model on first use, then start an interactive chat
ollama run llama3.2

# Use via OpenAI-compatible API
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello!"}]}'
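
The same endpoint can also stream its reply. The sketch below reuses the Quick Start request and adds "stream": true; the OLLAMA_BASE_URL variable is a hypothetical convenience introduced here, and the exact chunk format is an assumption (OpenAI-style data: lines are what a compatible server would be expected to send).

```shell
# Base URL of the local daemon; OLLAMA_BASE_URL is a hypothetical variable
# for this sketch, defaulting to the port mentioned above
OLLAMA_BASE_URL="${OLLAMA_BASE_URL:-http://localhost:11434}"

# Same request body as the Quick Start call, plus "stream": true
PAYLOAD='{"model": "llama3.2", "stream": true, "messages": [{"role": "user", "content": "Hello!"}]}'

# -N turns off curl's output buffering so chunks print as they arrive;
# --connect-timeout keeps the call from hanging when no daemon is listening
curl -sN --connect-timeout 5 "$OLLAMA_BASE_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" \
  || echo "request failed; is the Ollama daemon running?" >&2
```

Keeping the base URL in one variable mirrors how SDK clients are switched over: only the URL changes, while the request body stays in the OpenAI chat format.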

Install via ai-supply

npx ai-supply add ollama-local-model-runtime

Curated mirror of the open-source Ollama project (MIT). Install upstream from the repository.
