Eval · DevOps & Infra · Free
Opik (by Comet)
Open-source LLM evaluation and observability platform — trace, test, and monitor LLM apps end-to-end.
Installs: 42k
Rating: ★ 4.5
Reviews: 14
Opik — LLM Evaluation & Observability
Opik is an open-source, end-to-end LLM evaluation platform from Comet. It combines distributed tracing for LLM applications, an evaluation framework for testing prompts and RAG pipelines, and production monitoring in a single self-hostable platform.
Key Features
- LLM tracing: automatic instrumentation for LangChain, LlamaIndex, OpenAI, Anthropic
- Evaluation datasets: version and manage evaluation datasets with golden answers
- Automated scoring: built-in metrics (hallucination, answer relevance, context precision, BLEU)
- LLM-as-judge: define custom scoring with any LLM
- Online monitoring: track production metrics, catch regressions, set alerts
- Self-hostable with Docker Compose or Kubernetes
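To make the idea of scoring against a dataset of golden answers concrete, here is a minimal plain-Python sketch. It does not use Opik's API: `Example`, `exact_match`, `token_overlap`, `evaluate`, and `stub_task` are hypothetical names for illustration, and the two metrics are simple stand-ins for the built-in ones listed above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Example:
    """One dataset row: an input question and its golden (reference) answer."""
    question: str
    golden_answer: str


def exact_match(output: str, expected: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(output.strip().lower() == expected.strip().lower())


def token_overlap(output: str, expected: str) -> float:
    """Fraction of the expected tokens that also appear in the output."""
    expected_tokens = set(expected.lower().split())
    if not expected_tokens:
        return 0.0
    output_tokens = set(output.lower().split())
    return len(expected_tokens & output_tokens) / len(expected_tokens)


def evaluate(dataset, task, metrics):
    """Run `task` on every example and average each metric over the dataset."""
    scores = {metric.__name__: [] for metric in metrics}
    for example in dataset:
        output = task(example.question)
        for metric in metrics:
            scores[metric.__name__].append(metric(output, example.golden_answer))
    return {name: mean(values) for name, values in scores.items()}


dataset = [
    Example("What is the capital of France?", "Paris"),
    Example("What is 2 + 2?", "4"),
]


def stub_task(question: str) -> str:
    # Stand-in for a real LLM call; deliberately wrong on the second question.
    return "Paris" if "France" in question else "5"


results = evaluate(dataset, stub_task, [exact_match, token_overlap])
print(results)
```

In a real setup the stub would be replaced by the application under test, and the averaged scores would be compared across prompt or model versions to spot regressions.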
Quick Start
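import opik
from openai import OpenAI
from opik.integrations.openai import track_openai

# Point the SDK at a self-hosted Opik instance
opik.configure(use_local=True)

# Create an OpenAI client (reads OPENAI_API_KEY from the environment) and wrap it
openai_client = OpenAI()
client = track_openai(openai_client)

# All calls made through `client` are now automatically traced
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is RAG?"}],
)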
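For intuition about what wrapping a client for tracing involves, the sketch below records the inputs, output, errors, and latency of a function call using only the standard library. It is a conceptual illustration and not Opik's implementation: `TRACES`, `traced`, and `fake_completion` are hypothetical names, and a real tracer would send records to a backend instead of a list.

```python
import functools
import time

# Hypothetical in-memory trace store, used here only for illustration.
TRACES = []


def traced(fn):
    """Wrap `fn` so every call appends a trace record to TRACES."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        record = {"name": fn.__name__, "input": {"args": args, "kwargs": kwargs}}
        try:
            result = fn(*args, **kwargs)
            record["output"] = result
            return result
        except Exception as exc:
            # Record the failure, then let the caller see the original exception.
            record["error"] = repr(exc)
            raise
        finally:
            record["duration_s"] = time.perf_counter() - start
            TRACES.append(record)
    return wrapper


@traced
def fake_completion(prompt: str) -> str:
    # Stand-in for an LLM call so the sketch runs without network access.
    return f"echo: {prompt}"


fake_completion("What is RAG?")
print(TRACES[0]["name"], TRACES[0]["output"])
```

The caller's code is unchanged apart from the wrapper, which is the same property the Quick Start relies on when it swaps the plain client for the tracked one.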
Install via ai-supply
npx ai-supply add opik-llm-evaluation-platform
Curated mirror of the open-source Opik (Apache-2.0). Get it from the source.