Name: Opik (by Comet)
Author: ai-supply

Opik — LLM Evaluation & Observability

Opik is an open-source, end-to-end platform from Comet for evaluating and observing LLM applications. It combines distributed tracing for LLM applications, an evaluation framework for testing prompts and RAG pipelines, and production monitoring in a single self-hostable platform.

Key Features

LLM tracing: automatic instrumentation for LangChain, LlamaIndex, OpenAI, Anthropic
Evaluation datasets: version and manage evaluation datasets with golden answers
Automated scoring: built-in metrics (hallucination, answer relevance, context precision, BLEU)
LLM-as-judge: define custom scoring with any LLM
Online monitoring: track production metrics, catch regressions, set alerts
Self-hostable with Docker Compose or Kubernetes
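
To make the evaluation features above more concrete, here is a minimal, library-free sketch of scoring a task against a dataset with golden answers. It is not Opik's API: the names `Example`, `token_f1`, and `run_evaluation` are hypothetical, and token-overlap F1 is used only as a simple stand-in for the built-in metrics listed above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    """One evaluation item: an input question and its golden answer."""
    question: str
    golden_answer: str


def token_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1 between candidate and reference, in [0.0, 1.0]."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def run_evaluation(
    dataset: List[Example],
    task: Callable[[str], str],
    scorer: Callable[[str, str], float],
) -> float:
    """Run the task on every example and return the mean score (0.0 if empty)."""
    scores = [scorer(task(ex.question), ex.golden_answer) for ex in dataset]
    return sum(scores) / len(scores) if scores else 0.0
```

Because `scorer` is just a callable, an LLM-as-judge function that asks a model to grade the answer could be passed in place of `token_f1` without changing the loop.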

Quick Start

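import opik
from openai import OpenAI
from opik.integrations.openai import track_openai

opik.configure(use_local=True)  # point the SDK at a self-hosted Opik instance

# Wrap a standard OpenAI client (reads OPENAI_API_KEY from the environment)
openai_client = OpenAI()
client = track_openai(openai_client)

# Calls made through the wrapped client are now traced automatically
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is RAG?"}],
)
print(response.choices[0].message.content)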
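
As a rough illustration of what automatic tracing means, the sketch below records the inputs, output, errors, and duration of each call to a wrapped function. It is a conceptual stand-in, not how Opik is implemented: `traced`, `TRACES`, and `answer` are hypothetical names, and a real platform would send spans to a backend rather than keep them in a list.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory span store, used here only for illustration.
TRACES: List[Dict[str, Any]] = []


def traced(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record inputs, output or error, and duration for every call to func."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        span: Dict[str, Any] = {
            "name": func.__name__,
            "input": {"args": args, "kwargs": kwargs},
        }
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            span["output"] = result
            return result
        except Exception as exc:
            # Keep the failure in the span, then re-raise unchanged.
            span["error"] = repr(exc)
            raise
        finally:
            span["duration_s"] = time.perf_counter() - start
            TRACES.append(span)

    return wrapper


@traced
def answer(question: str) -> str:
    """Stand-in for a model call; returns a canned string."""
    return f"Echo: {question}"
```

Calling `answer("What is RAG?")` appends one span to `TRACES` containing the function name, the arguments, the returned string, and the elapsed time.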

Install via ai-supply

npx ai-supply add opik-llm-evaluation-platform

This is a curated mirror of the open-source Opik project (Apache-2.0). Get it from the upstream source.
