Currently in beta. You’re on the priority waitlist and among the first to get access.

Overview

Stream audio in real time over a WebSocket connection for the lowest latency. This interface suits interactive assistants and live applications. For simpler use cases, see TTS REST or TTS SSE.
Note: Passing numbers, IDs, dates, or currency as raw strings causes mispronunciations. See the Input Formatting Guide for the correct formatting of phone numbers, account numbers, PINs, Aadhaar, vehicle registration numbers, GSTIN, currency, and more.

Endpoint

wss://api.vachana.ai/api/v1/tts

Authentication

All Realtime connections require the following headers:
Header         Required   Description                       Example
Content-Type   Yes        Must be application/json          application/json
X-API-Key-ID   Yes        Your API key for authentication   <your-api-key-id>
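
As a minimal sketch, the two headers listed above can be assembled in one place so that the key never appears in source code. The helper name `build_auth_headers` and the environment variable `VACHANA_API_KEY_ID` are assumptions made for illustration; only the header names come from this page.

```python
import os


def build_auth_headers(api_key_id=None):
    """Return the headers required for a realtime TTS connection.

    Falls back to the VACHANA_API_KEY_ID environment variable (a hypothetical
    name chosen for this sketch) when no key is passed explicitly.
    """
    key = api_key_id or os.environ.get("VACHANA_API_KEY_ID")
    if not key:
        raise ValueError("No API key ID provided")
    return {
        "Content-Type": "application/json",
        "X-API-Key-ID": key,
    }
```

Failing early on a missing key gives a clearer error than a rejected WebSocket handshake.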

Request Format

Send a JSON message with the following structure:
{
  "text": "नमस्ते, आप कैसे हैं?",
  "model": "vachana-voice-v2",
  "audio_config": {
    "sample_rate": 44100,
    "encoding": "linear_pcm"
  }
}

num_channels (integer, required): Number of audio channels (e.g., 1 for mono, 2 for stereo)
sample_width (integer, required): Sample width in bytes (e.g., 2 for 16-bit audio)
encoding (string, required): Audio encoding format (e.g., linear_pcm)
container (string, required): Audio container format (e.g., wav)
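
A minimal sketch of building this message in Python. It assumes that the four fields listed above belong inside `audio_config` alongside `sample_rate`, which this page does not state explicitly. The function `build_tts_request` and its default values are illustrative only and are not part of any SDK.

```python
import json


def build_tts_request(text, model="vachana-voice-v2", sample_rate=44100,
                      encoding="linear_pcm", num_channels=1, sample_width=2,
                      container="wav"):
    """Serialize a TTS request message to a JSON string."""
    if not text:
        raise ValueError("text must not be empty")
    message = {
        "text": text,
        "model": model,
        "audio_config": {
            "sample_rate": sample_rate,
            "encoding": encoding,
            # Assumed placement: the page lists these fields but does not
            # show where they sit in the message.
            "num_channels": num_channels,
            "sample_width": sample_width,
            "container": container,
        },
    }
    # ensure_ascii=False keeps Devanagari and other non-ASCII text readable.
    return json.dumps(message, ensure_ascii=False)
```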

Response

The server streams audio data in real time as binary chunks. Each chunk contains PCM audio data encoded according to the audio_config sent in the request.
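
Raw PCM carries no header, so most players cannot open it directly. The sketch below joins received chunks and wraps them in a WAV container using Python's standard `wave` module. It assumes the chunks are headerless 16-bit mono PCM at 44100 Hz; if the server already returns data in a WAV container, this step is unnecessary. `pcm_chunks_to_wav` and `pcm_duration_seconds` are hypothetical helper names.

```python
import io
import wave


def pcm_chunks_to_wav(chunks, sample_rate=44100, num_channels=1, sample_width=2):
    """Join raw PCM chunks and return them as WAV-formatted bytes."""
    pcm = b"".join(chunks)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(num_channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def pcm_duration_seconds(num_bytes, sample_rate=44100, num_channels=1, sample_width=2):
    """Playback duration implied by a byte count of raw PCM."""
    return num_bytes / (sample_rate * num_channels * sample_width)
```

The duration helper follows from the PCM layout: one second of audio occupies sample_rate × num_channels × sample_width bytes.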

Example Usage

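// Requires the "ws" package for Node.js (npm install ws).
// The browser WebSocket API does not accept custom headers.
const WebSocket = require("ws");

const ws = new WebSocket("wss://api.vachana.ai/api/v1/tts", {
  headers: {
    "Content-Type": "application/json",
    "X-API-Key-ID": "<your-api-key-id>",
  },
});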

ws.on("open", () => {
  const request = {
    text: "नमस्ते, आप कैसे हैं?",
    model: "vachana-voice-v2",
    audio_config: {
      sample_rate: 44100,
      encoding: "linear_pcm",
    },
  };

  ws.send(JSON.stringify(request));
});

ws.on("message", (data) => {
  // Handle audio chunks
  console.log("Received audio chunk:", data);
});

ws.on("error", (error) => {
  console.error("WebSocket error:", error);
});

ws.on("close", () => {
  console.log("WebSocket connection closed");
});

Python SDK

The SDK’s realtime client manages the WebSocket lifecycle, audio streaming, and async iteration so you can focus on your application logic.

Installation

pip install gnani-vachana
Requires Python 3.9+.

Authentication

from gnani.tts import GnaniTTSRealtimeClient

client = GnaniTTSRealtimeClient(api_key="your-api-key")

Stream Audio Chunks in Real-Time

Use the async context manager to open the connection and iterate over audio chunks as they arrive.
import asyncio
from gnani.tts import GnaniTTSRealtimeClient

async def main():
    async with GnaniTTSRealtimeClient(api_key="your-api-key") as client:
        with open("output.wav", "wb") as f:
            async for chunk in client.synthesize(
                "नमस्ते, आप कैसे हैं?",
                voice="sia",
            ):
                f.write(chunk)

asyncio.run(main())

Collect All Audio at Once

If you don’t need to process chunks as they arrive, use synthesize_and_collect to get the full audio as a single bytes object.
import asyncio
from gnani.tts import GnaniTTSRealtimeClient

async def main():
    async with GnaniTTSRealtimeClient(api_key="your-api-key") as client:
        audio = await client.synthesize_and_collect(
            "Realtime TTS response",
            voice="neha",
        )
        with open("output.wav", "wb") as f:
            f.write(audio)

asyncio.run(main())