DVC (Data Version Control)

Manage data the way code is managed — Git-like version control for AI/ML and data science.

Free tier available · Technical · API available · Open source

Key strengths

- Git-like versioning for datasets and ML models
- Open source with a large, active community
- Integrates with existing Git workflows
- Supports petabyte-scale data lakes and object stores via lakeFS
- Works with major cloud storage providers and local filesystems

Free tier + paid plans
San Francisco, USA
Founded 2017
Self-hostable

DVC is a command-line tool, with a companion VS Code extension, that works alongside Git: it stores metadata and pointers in your Git repository while offloading large data files and model artifacts to remote storage (S3, GCS, Azure Blob, SSH, and more). It makes ML pipelines reproducible through a DAG-based pipeline system and tracks experiments with lightweight branching semantics. lakeFS, its enterprise counterpart, provides a full Git-like branching model directly on top of object stores (S3-compatible, Azure Data Lake, GCS) for teams managing complex, large-scale data infrastructure.
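
The pointer-plus-remote-storage idea described above can be illustrated with a minimal Python sketch. This is not DVC's implementation or API: the `track` and `restore` functions, the `.ptr` file suffix, the JSON pointer layout, and the use of SHA-256 are all assumptions made for illustration, and a local directory stands in for remote storage.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def track(data_file: Path, store_dir: Path) -> Path:
    """Copy a file into a content-addressed store and write a small pointer file.

    Hypothetical helper: the pointer file is what would be committed to Git,
    while the store directory stands in for remote storage.
    """
    content = data_file.read_bytes()
    digest = hashlib.sha256(content).hexdigest()
    # Shard by the first two hex characters to avoid one huge directory.
    stored = store_dir / digest[:2] / digest[2:]
    stored.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(data_file, stored)
    pointer = data_file.with_name(data_file.name + ".ptr")
    meta = {"hash": digest, "size": len(content), "path": data_file.name}
    pointer.write_text(json.dumps(meta, indent=2))
    return pointer


def restore(pointer: Path, store_dir: Path) -> Path:
    """Recreate the data file next to its pointer from the store."""
    meta = json.loads(pointer.read_text())
    stored = store_dir / meta["hash"][:2] / meta["hash"][2:]
    target = pointer.with_name(meta["path"])
    shutil.copy2(stored, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "train.csv"
        data.write_text("x,y\n1,2\n")
        ptr = track(data, root / "store")
        data.unlink()  # simulate a fresh clone that only has the pointer
        print(restore(ptr, root / "store").read_text())
```

Only the small pointer file needs to live in the Git repository; the data itself is looked up by its content hash, which is why identical files are stored once and any commit can be matched to the exact data it used.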