Vespa

Free tier

AI Search Platform for large-scale vector search, ranking, and real-time inference

Free tier available · Technical · API available · Open source

Key strengths

Hybrid vector + text + structured search in a single platform
Native tensor support for complex ML-driven ranking
Real-time inference at sub-100ms latency at billions-of-document scale
Streaming search mode for personal/private data (20x cheaper than indexing)
Fully managed cloud offering (Vespa Cloud) plus open-source self-hosting

Free tier + paid plans
Oslo, Norway
Founded 2017
Self-hostable

Vespa is a distributed AI search platform with native tensor computation support, so machine-learned models can be evaluated directly within the retrieval layer, without a separate ranking service. It supports hybrid retrieval that combines BM25 text search, approximate nearest neighbor (ANN) vector search, and structured filtering in a single query pass. Vespa's architecture provides automated horizontal scalability, continuous deployment with zero downtime, and a streaming search mode for per-user data that avoids expensive global indexing. The platform integrates with AWS and is available as a managed cloud service (Vespa Cloud) or as an open-source engine deployable on any infrastructure.
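
To make the single-pass hybrid retrieval described above more concrete, here is a minimal sketch of how a request body for Vespa's query API might be assembled. It only builds the JSON and does not contact a server. The field names (`embedding`, `category`), the query tensor name (`q`), and the rank profile name (`hybrid`) are assumptions for illustration; they would need to match the schema of an actual Vespa application.

```python
import json


def build_hybrid_query(text, query_vector, category, target_hits=10):
    """Build a JSON body combining text search, ANN search, and a filter.

    Hypothetical names: the "embedding" and "category" fields, the query
    tensor "q", and the "hybrid" rank profile are assumed for illustration.
    """
    # Keep the sketch safe: refuse values that would break out of the YQL string.
    if '"' in category or "\\" in category:
        raise ValueError("category must not contain quotes or backslashes")

    yql = (
        "select * from sources * where "
        f'category contains "{category}" and '  # structured filter
        "(userQuery() or "                       # text match on the query string
        f"({{targetHits:{target_hits}}}nearestNeighbor(embedding, q)))"  # ANN
    )
    return {
        "yql": yql,
        "query": text,                    # consumed by userQuery()
        "ranking": "hybrid",              # assumed rank profile name
        "input.query(q)": query_vector,   # query embedding for nearestNeighbor
        "hits": target_hits,
    }


body = build_hybrid_query("vector search", [0.1, 0.2, 0.3], "docs")
print(json.dumps(body, indent=2))
```

A body like this would typically be sent as a POST to the application's search endpoint. How the text and vector matches are blended into a final score depends on the rank profile defined in the application, which is not shown here.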