Skill · Language & NLP · Free
Argilla — Collaborative Data Annotation for LLMs
Open-source annotation platform for building high-quality fine-tuning and RLHF datasets; integrates with Hugging Face Hub.
Installs: 48k
Rating: ★ 4.5
Reviews: 16
Argilla
Argilla is a collaboration platform for AI engineers and domain experts to annotate, curate, and quality-control datasets for fine-tuning and RLHF. It provides a polished web UI for labelling, a Python SDK for programmatic data management, and native Hugging Face Hub sync.
Key Features
- Web UI: label text, images, ranking, rating, span, multi-label tasks
- Custom schemas: define any labelling task with Argilla's dataset settings API
- Human + model-in-the-loop: pre-label records with your model, then have human annotators review and correct the predictions
- RLHF/DPO: preference ranking and comparison tasks built-in
- Hugging Face Hub: push/pull datasets directly with rg.Dataset.from_hub()
- REST API + Python SDK + webhook integrations
- Self-hostable via Docker; managed via Hugging Face Spaces
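As a rough illustration of what preference-ranking annotations are typically used for, the sketch below expands one best-first ranking into chosen/rejected pairs, a shape that DPO-style training setups commonly expect. It is plain Python and does not call Argilla; the helper `ranking_to_pairs` and the keys `prompt`, `chosen`, and `rejected` are assumptions made for illustration, not part of the Argilla SDK.

```python
from itertools import combinations


def ranking_to_pairs(prompt, ranked_responses):
    """Expand a best-first ranking into (chosen, rejected) preference pairs.

    Hypothetical helper: `ranked_responses` is assumed to be ordered from
    most preferred to least preferred by the annotator.
    """
    pairs = []
    # combinations() keeps input order, so `better` always outranks `worse`.
    for better, worse in combinations(ranked_responses, 2):
        pairs.append({"prompt": prompt, "chosen": better, "rejected": worse})
    return pairs


pairs = ranking_to_pairs(
    "Explain RLHF in one line.",
    ["answer A", "answer B", "answer C"],
)
for pair in pairs:
    print(pair["chosen"], ">", pair["rejected"])
```

A ranking of n responses yields n(n-1)/2 pairs, so long rankings may be worth subsampling before training.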
Quick Start
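import argilla as rg

# Connect to a running Argilla server; replace the URL and API key with your own.
client = rg.Argilla(api_url="http://localhost:6900", api_key="argilla.apikey")

# Define the task: one text field and one single-label question.
dataset = rg.Dataset(
    name="sentiment-annotation",
    settings=rg.Settings(
        fields=[rg.TextField(name="text")],
        questions=[rg.LabelQuestion(name="label", labels=["positive", "negative", "neutral"])],
    ),
)
dataset.create()  # create the dataset on the server

# Add a record so it shows up in the web UI for annotation.
records = [rg.Record(fields={"text": "Argilla makes annotation easy!"})]
dataset.records.log(records)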
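The record logged above carries no label yet. For the model-in-the-loop workflow listed under Key Features, a minimal sketch of the pre-labelling step might look like the following. It is plain Python that reuses the Quick Start field name `text` and question name `label`; `toy_model` is a hypothetical keyword matcher standing in for a real classifier, and `prelabel` is an illustrative helper, not an Argilla function.

```python
LABELS = ["positive", "negative", "neutral"]


def toy_model(text):
    """Hypothetical stand-in for a real classifier: keyword matching only."""
    lowered = text.lower()
    if any(word in lowered for word in ("easy", "great", "love")):
        return "positive"
    if any(word in lowered for word in ("hard", "broken", "hate")):
        return "negative"
    return "neutral"


def prelabel(texts, predict=toy_model):
    """Pair each text with a predicted label for later human review."""
    rows = []
    for text in texts:
        label = predict(text)
        # Reject predictions that the labelling schema does not allow.
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        rows.append({"text": text, "label": label})
    return rows


rows = prelabel(["Argilla makes annotation easy!", "The export is broken."])
for row in rows:
    print(row)
```

Whether rows like these can be passed straight to `dataset.records.log`, and whether the `label` value is then shown as a suggestion rather than a final answer, depends on the SDK version; check the Argilla documentation before relying on it.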
Install via ai-supply
npx ai-supply add argilla-dataset-annotation-platform
Curated mirror of the open-source Argilla (Apache-2.0). Get it from the source.