Speechmatics API — Technical Reference

Authentication

Every request must carry your API key as a Bearer token in the Authorization header, in the form `Authorization: Bearer <YOUR_API_KEY>`. Obtain the key from the Speechmatics dashboard.
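As a minimal sketch of the header format described above, the helper below builds the Authorization header from a key. The function name `auth_headers` is hypothetical and only serves the illustration; it is not part of any Speechmatics SDK.

```python
def auth_headers(api_key: str) -> dict:
    """Build the Authorization header for a request (hypothetical helper)."""
    key = api_key.strip()
    if not key:
        # Fail early rather than sending a request that cannot be authenticated.
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Bearer {key}"}
```

Keeping the key out of source code (for example, reading it from an environment variable and passing it to this helper) avoids committing credentials by accident.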

Real-Time Transcription (WebSocket)

Open a WebSocket connection to the real-time endpoint and stream audio in chunks. The API returns transcripts as JSON messages with low latency, typically under 1 second.

# Example: Start a real-time session (conceptual)
wscat -c wss://eu2.rt.speechmatics.com/v2 \
  -H "Authorization: Bearer <YOUR_API_KEY>"
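Once the socket is open, a client typically sends a start message, then the audio in binary chunks, then an end marker. The sketch below only prepares those payloads and opens no connection. The message names and fields (`StartRecognition`, `audio_format`, `EndOfStream`, `last_seq_no`) and the raw PCM settings are not stated in this reference; they are assumptions about the protocol and should be checked against the official real-time API documentation. The helper names are hypothetical.

```python
import json
from typing import Iterator


def start_recognition(language: str = "en", enable_partials: bool = True,
                      sample_rate: int = 16000) -> str:
    """JSON text for the first message of a session (assumed message shape)."""
    message = {
        "message": "StartRecognition",  # assumed message name
        "audio_format": {               # assumed: raw 16-bit little-endian PCM
            "type": "raw",
            "encoding": "pcm_s16le",
            "sample_rate": sample_rate,
        },
        "transcription_config": {
            "language": language,
            "enable_partials": enable_partials,
        },
    }
    return json.dumps(message)


def audio_chunks(audio: bytes, chunk_size: int = 4096) -> Iterator[bytes]:
    """Split audio into fixed-size chunks; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]


def end_of_stream(chunks_sent: int) -> str:
    """JSON text that closes the audio stream (assumed message shape)."""
    return json.dumps({"message": "EndOfStream", "last_seq_no": chunks_sent})
```

With a WebSocket client of your choice, the start message would go out as a text frame, each chunk as a binary frame, and the end marker as a final text frame.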

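Batch Transcription (REST)

Submit an audio file together with a JSON job configuration as multipart form data to the jobs endpoint, as in the following request.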

curl -X POST https://asr.api.speechmatics.com/v2/jobs \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -F "data_file=@audio.wav" \
  -F 'config={"type":"transcription","transcription_config":{"language":"en"}}'
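The same request can be assembled with the Python standard library alone. This is a sketch rather than an official client: it mirrors the two form fields of the curl command (`data_file` and `config`) and builds the request without sending it. The function name `build_job_request` and the `application/octet-stream` content type for the audio part are assumptions made for the illustration.

```python
import json
import urllib.request
import uuid

JOBS_URL = "https://asr.api.speechmatics.com/v2/jobs"  # taken from the curl example


def build_job_request(api_key: str, filename: str, audio: bytes,
                      language: str = "en") -> urllib.request.Request:
    """Build (but do not send) a multipart POST equivalent to the curl command."""
    config = {"type": "transcription",
              "transcription_config": {"language": language}}
    boundary = uuid.uuid4().hex  # random separator between form parts
    parts = [
        # Part 1: the audio file, sent under the "data_file" field.
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="data_file"; filename="{filename}"\r\n'
         'Content-Type: application/octet-stream\r\n\r\n').encode() + audio + b"\r\n",
        # Part 2: the job configuration as a JSON string under "config".
        (f'--{boundary}\r\n'
         'Content-Disposition: form-data; name="config"\r\n\r\n'
         f'{json.dumps(config)}\r\n').encode(),
        # Closing boundary.
        f'--{boundary}--\r\n'.encode(),
    ]
    return urllib.request.Request(
        JOBS_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job. The response format is not described in this reference, so the sketch stops short of parsing it.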

Key Parameters

Parameter	Description
`language`	BCP-47 language code (55+ supported)
`enable_partials`	Return partial (interim) results before each final transcript; applies to real-time sessions
`diarization`	Enable speaker separation (`speaker` or `channel`)
`operating_point`	`standard` or `enhanced` accuracy model
`custom_vocabulary`	Provide domain-specific terms to boost accuracy
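To illustrate how the parameters in the table combine, the sketch below assembles a `transcription_config` dictionary and rejects values the table does not list. The function name is hypothetical, and representing `custom_vocabulary` as a plain list of strings is an assumption; the exact shape the API expects is not specified here.

```python
from typing import Any, Dict, List, Optional

DIARIZATION_MODES = ("speaker", "channel")     # values listed in the table
OPERATING_POINTS = ("standard", "enhanced")    # values listed in the table


def build_transcription_config(
    language: str,
    enable_partials: Optional[bool] = None,
    diarization: Optional[str] = None,
    operating_point: Optional[str] = None,
    custom_vocabulary: Optional[List[str]] = None,  # assumed shape: list of terms
) -> Dict[str, Any]:
    """Return a transcription_config dict containing only the options that were set."""
    if not language:
        raise ValueError("language is required")
    if diarization is not None and diarization not in DIARIZATION_MODES:
        raise ValueError(f"diarization must be one of {DIARIZATION_MODES}")
    if operating_point is not None and operating_point not in OPERATING_POINTS:
        raise ValueError(f"operating_point must be one of {OPERATING_POINTS}")

    config: Dict[str, Any] = {"language": language}
    optional = {
        "enable_partials": enable_partials,
        "diarization": diarization,
        "operating_point": operating_point,
        "custom_vocabulary": custom_vocabulary,
    }
    # Leave unset options out so the service applies its own defaults.
    for name, value in optional.items():
        if value is not None:
            config[name] = value
    return config
```

For example, `build_transcription_config("en", diarization="speaker", operating_point="enhanced")` yields a dictionary that can be placed under `transcription_config` in the job configuration.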

Deployment Options

Cloud API — Multi-region SaaS (EU, US)
On-Premises — Docker/Kubernetes deployment with no data egress
On-Device — Optimized quantized models for edge hardware (e.g., laptop CPUs)