Rev AI

The world's most accurate speech-to-text API for developers, built for speed and global scale.

Free tier available · Technical · Powered by Rev AI (proprietary models trained on 7M+ hours of human-verified speech data) · API available

Key strengths

  • Industry-leading Word Error Rate (WER) across diverse accents, genders, and nationalities
  • Supports 57+ languages with context-aware translation
  • HIPAA, SOC 2, GDPR, and PCI compliant with 99.99% uptime
  • Both async (pre-recorded) and streaming (real-time) speech-to-text APIs
  • AI Insights layer: sentiment analysis, topic extraction, summarization, and language identification
Free tier + paid plans
San Francisco, USA
Founded 2010
Self-hostable

Rev AI API — Technical Reference

Authentication

All requests require a Bearer token in the Authorization header:

Authorization: Bearer YOUR_ACCESS_TOKEN
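
As a minimal sketch of the header above, the helper below builds the Authorization header from a token passed in directly or read from the environment. The function name auth_headers and the environment variable REVAI_ACCESS_TOKEN are hypothetical names chosen for this example, not part of the Rev AI API.

```python
import os
from typing import Dict, Optional


def auth_headers(access_token: Optional[str] = None) -> Dict[str, str]:
    """Return the Bearer Authorization header for a request."""
    # REVAI_ACCESS_TOKEN is an assumed variable name used only in this sketch.
    token = access_token or os.environ.get("REVAI_ACCESS_TOKEN", "")
    if not token:
        raise ValueError("no access token provided")
    return {"Authorization": f"Bearer {token}"}
```

Reading the token from the environment keeps it out of source control; any secret store would serve the same purpose.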

Async Speech-to-Text

Submit a job via POST /speechtotext/v1/jobs:

{
  "media_url": "https://example.com/audio.mp3",
  "metadata": "optional-job-label",
  "language": "en"
}
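
The request above could be assembled with the Python standard library roughly as follows. This is a sketch under assumptions: the host https://api.rev.ai is inferred from the streaming URL in this reference, and build_submit_request is a hypothetical helper name. The function only builds the request; nothing is sent until it is passed to urllib.request.urlopen.

```python
import json
import urllib.request
from typing import Optional

# Assumed base URL, inferred from the wss://api.rev.ai streaming endpoint.
BASE_URL = "https://api.rev.ai"


def build_submit_request(
    access_token: str,
    media_url: str,
    language: str = "en",
    metadata: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that submits an async job."""
    body = {"media_url": media_url, "language": language}
    if metadata is not None:
        body["metadata"] = metadata
    return urllib.request.Request(
        url=f"{BASE_URL}/speechtotext/v1/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To submit the job, pass the returned object to urllib.request.urlopen and decode the JSON response, which is expected to include the job id used by the status and transcript endpoints.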

Poll job status at GET /speechtotext/v1/jobs/{id}, then fetch the transcript at GET /speechtotext/v1/jobs/{id}/transcript.
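
A polling loop for the status endpoint might look like the sketch below. It assumes the job object carries a "status" field whose value is "in_progress" until processing ends; that field name and value are assumptions to verify against the API reference. The HTTP call is injected as fetch_status, a hypothetical callable that performs GET /speechtotext/v1/jobs/{id} and returns the decoded JSON, so the loop itself stays independent of any HTTP client.

```python
import time
from typing import Callable, Dict


def wait_for_job(
    fetch_status: Callable[[], Dict],
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the job leaves the in-progress state, then return it."""
    for _ in range(max_polls):
        job = fetch_status()
        # "in_progress" is the assumed name of the non-terminal status.
        if job.get("status") != "in_progress":
            return job
        sleep(interval_s)
    raise TimeoutError(f"job still in progress after {max_polls} polls")
```

Once the returned job reports success, the transcript can be fetched from GET /speechtotext/v1/jobs/{id}/transcript. The caller should still inspect the final status, since the loop also returns on failure.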

Streaming Speech-to-Text

Connect via WebSocket at wss://api.rev.ai/speechtotext/v1/stream with query parameters for access_token, content_type, and language. Send raw audio bytes and receive partial/final hypothesis JSON messages in real time.
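
The two helpers below sketch the client side of that exchange without choosing a WebSocket library: one builds the connection URL from the three query parameters, the other reads a hypothesis message. The message shape, a "type" field of "partial" or "final" plus an "elements" list of objects with "type" and "value", is an assumption made for illustration and should be checked against the streaming documentation. Both function names are hypothetical.

```python
import json
from typing import Tuple
from urllib.parse import urlencode

STREAM_URL = "wss://api.rev.ai/speechtotext/v1/stream"


def build_stream_url(access_token: str, content_type: str, language: str = "en") -> str:
    """Return the WebSocket URL with percent-encoded query parameters."""
    query = urlencode({
        "access_token": access_token,
        "content_type": content_type,
        "language": language,
    })
    return f"{STREAM_URL}?{query}"


def parse_hypothesis(message: str) -> Tuple[str, str]:
    """Return (message type, space-joined text) from one JSON message."""
    payload = json.loads(message)
    # Assumed shape: {"type": ..., "elements": [{"type": "text", "value": ...}]}
    words = [
        el.get("value", "")
        for el in payload.get("elements", [])
        if el.get("type") == "text"
    ]
    return payload.get("type", ""), " ".join(words)
```

A client would typically overwrite its on-screen text with each partial hypothesis and commit the text only when a final hypothesis arrives.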

Key Parameters

  • language — BCP-47 language code (57+ supported)
  • speaker_channels_count — Number of audio channels to transcribe separately, one speaker per channel
  • custom_vocabulary_id — Boost domain-specific terms
  • filter_profanity — Boolean profanity filter
  • remove_disfluencies — Strip filler words (uh, um)
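
To show how these parameters might combine in one job body, here is a small sketch. It assumes they are sent as top-level JSON fields alongside media_url, which this reference implies but does not state outright; build_job_options is a hypothetical helper name. Unset options are left out so the service defaults apply.

```python
from typing import Any, Dict, Optional


def build_job_options(
    media_url: str,
    language: str = "en",
    speaker_channels_count: Optional[int] = None,
    custom_vocabulary_id: Optional[str] = None,
    filter_profanity: Optional[bool] = None,
    remove_disfluencies: Optional[bool] = None,
) -> Dict[str, Any]:
    """Return a job body containing only the options that were set."""
    if speaker_channels_count is not None and speaker_channels_count < 1:
        raise ValueError("speaker_channels_count must be at least 1")
    candidates = {
        "media_url": media_url,
        "language": language,
        "speaker_channels_count": speaker_channels_count,
        "custom_vocabulary_id": custom_vocabulary_id,
        "filter_profanity": filter_profanity,
        "remove_disfluencies": remove_disfluencies,
    }
    # Drop unset (None) entries; explicit False values are kept.
    return {key: value for key, value in candidates.items() if value is not None}
```

For example, build_job_options("https://example.com/audio.mp3", language="es", filter_profanity=True) yields a body with three fields: media_url, language, and filter_profanity.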

SDKs

Official SDKs are available for Python, Node.js, Java, C#, and Go. Integration is estimated to take under one hour.