Currently in beta. You’re on the priority waitlist and among the first to get access.
Overview
Stream audio in real-time with the lowest latency. Perfect for interactive assistants and live applications. For simpler use cases, see TTS REST or TTS SSE.

Passing numbers, IDs, dates, or currency as raw strings causes mispronunciations. See the Input Formatting Guide for correct formatting of phone numbers, account numbers, PINs, Aadhaar, vehicle registration numbers, GSTIN, currency, and more.
Endpoint
Authentication
All Realtime connections require the following headers:

| Header | Required | Description | Example |
|---|---|---|---|
| Content-Type | Yes | Must be application/json | application/json |
| X-API-Key-ID | Yes | Your API key for authentication | <your-api-key-id> |
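
As an illustration of the header table above, the following minimal sketch builds the two required headers in Python. The function name `build_realtime_headers` is hypothetical and not part of any SDK. How the headers are attached to a connection depends on the client library you use, which this page does not specify.

```python
def build_realtime_headers(api_key_id: str) -> dict[str, str]:
    """Return the two headers listed as required for Realtime connections.

    `build_realtime_headers` is a hypothetical helper for illustration only.
    """
    if not api_key_id:
        raise ValueError("api_key_id must be a non-empty string")
    return {
        "Content-Type": "application/json",  # must be application/json
        "X-API-Key-ID": api_key_id,          # your API key ID
    }
```

Pass the returned dictionary to whatever connection call your client library provides. Keeping the key out of source code, for example by reading it from an environment variable, is generally a safer choice.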
Request Format
Send a JSON message with the following structure:

- Number of audio channels (e.g., 1 for mono, 2 for stereo)
- Sample width in bytes (e.g., 2 for 16-bit audio)
- Audio encoding format (e.g., linear_pcm)
- Audio container format (e.g., wav)

Response
The server streams audio data in real-time as binary chunks. Each chunk contains PCM audio data according to the specified audio_config.
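
One way to handle the streamed chunks is to collect them into a playable WAV file. The sketch below uses Python's standard `wave` module and assumes the chunks are headerless linear PCM. The function name `pcm_chunks_to_wav` and its parameter names are hypothetical. The sample rate is not stated on this page, so the caller must supply whatever value matches the request.

```python
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(
    chunks: Iterable[bytes],
    channels: int,
    sample_width: int,
    sample_rate: int,
) -> bytes:
    """Wrap raw PCM chunks in a WAV container and return the file bytes.

    Assumes each chunk is headerless linear PCM. `sample_rate` is an
    assumption the caller must supply; this page does not document it.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)      # 1 for mono, 2 for stereo
        wav_file.setsampwidth(sample_width)  # bytes per sample, 2 for 16-bit
        wav_file.setframerate(sample_rate)
        for chunk in chunks:
            # Append each binary chunk in arrival order.
            wav_file.writeframes(chunk)
    return buffer.getvalue()
```

For example, `pcm_chunks_to_wav(received_chunks, channels=1, sample_width=2, sample_rate=rate)` returns bytes that can be written straight to a `.wav` file. If the server already sends a WAV header in the stream, skip this wrapping step.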