Argilla
Free tier
The open-source collaboration tool for AI engineers and domain experts to build high-quality datasets
Free · All audiences · Powered by Hugging Face · API available · Open source
Key strengths
- Open-source and self-hostable with full data control
- Intuitive API for seamless integration into existing ML pipelines
- Supports RLHF, fine-tuning, and active learning workflows
- Combines AI automation with human-in-the-loop feedback
- Strong community support and ecosystem via Hugging Face
Completely free
Madrid, Spain
Founded 2021
Self-hostable
Developer Documentation
Argilla provides a full-featured Python SDK and REST API for integrating dataset curation into your MLOps pipeline.
Installation
pip install argilla
Quickstart Example
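import argilla as rg
import pandas as pd
# Note: rg.init and rg.log belong to the legacy Argilla 1.x client (pip install "argilla<2")
# Connect to your Argilla server
rg.init(api_url="http://localhost:6900", api_key="YOUR_API_KEY")
# Example input; the "text" column name is an assumption about your data
df = pd.DataFrame({"text": ["I love this product", "Terrible support"]})
# Create and log a dataset
dataset = rg.DatasetForTextClassification.from_pandas(df)
rg.log(dataset, name="my-classification-dataset")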
Key Capabilities for Developers
- Distilabel integration: Use the companion distilabel library for scalable AI-assisted data generation and synthetic labeling pipelines.
- Hugging Face Hub: Push and pull datasets directly to and from the Hugging Face Hub using native integrations.
- Active learning support: Plug in your models to prioritize uncertain samples for annotation, reducing labeling costs.
- REST API: Full REST API available for custom integrations and programmatic dataset management.
- Self-hosting: Deploy via Docker with full control over your data and infrastructure.
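To make the active learning point above concrete, here is a minimal sketch of uncertainty sampling in plain Python. It does not call the Argilla SDK. The record ids, the probability values, and the helper names (`entropy`, `rank_by_uncertainty`) are hypothetical, and entropy is only one of several possible uncertainty measures. The idea is to score each unlabeled record by how unsure the model is and send the most uncertain ones to annotators first.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rank_by_uncertainty(predictions):
    """Return record ids sorted from most to least uncertain.

    `predictions` maps a record id to its list of class probabilities.
    """
    return sorted(
        predictions,
        key=lambda record_id: entropy(predictions[record_id]),
        reverse=True,
    )


# Hypothetical model outputs for three unlabeled records (illustrative only).
predictions = {
    "rec-1": [0.98, 0.02],  # confident prediction, low annotation priority
    "rec-2": [0.55, 0.45],  # near coin-flip, high annotation priority
    "rec-3": [0.80, 0.20],
}

queue = rank_by_uncertainty(predictions)
print(queue)
```

In practice, the ranked ids would decide which records are logged or surfaced for annotation first. How that ordering is passed to Argilla depends on the SDK version in use.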
Refer to the Argilla Docs and Distilabel Docs for full API references and advanced guides.
