Argilla
Free tier
The open-source collaboration tool for AI engineers and domain experts to build high-quality datasets
Free·All audiences·Powered by Hugging Face·API available·Open source
Key strengths
Open-source and self-hostable with full data control
Intuitive API for seamless integration into existing ML pipelines
Supports RLHF, fine-tuning, and active learning workflows
Combines AI automation with human-in-the-loop feedback
Strong community support and ecosystem via Hugging Face
Completely free
Madrid, Spain
Founded 2021
Self-hostable
Argilla is an open-source framework for data annotation and dataset building, tailored to NLP and LLM workflows. It provides a Python SDK and a web-based UI, integrates with popular ML libraries, and supports pipelines for RLHF, supervised fine-tuning (SFT), and model evaluation. Its companion library, Distilabel, enables AI-assisted data generation and labeling at scale. Argilla can be self-hosted or run in the cloud, and its tight integration with the Hugging Face ecosystem makes it straightforward to push curated datasets directly to the Hub.
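The human-in-the-loop pattern behind this kind of workflow can be sketched in plain Python. The snippet below is only an illustration and deliberately does not use Argilla's SDK: the Record class, the suggest rule standing in for a model, and the curate helper are all hypothetical names chosen for this example. It shows a model proposing labels, a reviewer accepting or correcting them, and only reviewed records making it into the curated dataset.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    """One item to annotate (hypothetical structure, not Argilla's schema)."""
    text: str
    suggestion: Optional[str] = None  # label proposed by a model
    response: Optional[str] = None    # label confirmed or corrected by a human


def suggest(text: str) -> str:
    """Stand-in for a model: a trivial keyword rule, assumed for illustration."""
    return "positive" if "great" in text.lower() else "negative"


def curate(records: List[Record]) -> List[Tuple[str, str]]:
    """Keep only human-reviewed records, as (text, label) pairs."""
    return [(r.text, r.response) for r in records if r.response is not None]


texts = ["Great docs", "Setup was painful", "Not great, honestly"]
records = [Record(text=t) for t in texts]

# Step 1: the model pre-labels every record.
for r in records:
    r.suggestion = suggest(r.text)

# Step 2: simulated human review. The second record is left unreviewed.
records[0].response = records[0].suggestion  # reviewer accepts the suggestion
records[2].response = "negative"             # reviewer corrects the suggestion

# Step 3: build the curated dataset and measure model/human agreement.
dataset = curate(records)
reviewed = [r for r in records if r.response is not None]
agreement = sum(r.response == r.suggestion for r in reviewed) / len(reviewed)

print(dataset)
print(agreement)
```

Tracking agreement between suggestions and responses is one simple way to decide how far AI-assisted labels can be trusted before a human has reviewed them.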
