POST /api/v1/tts/inference

TTS Inference
curl --request POST \
  --url https://api.vachana.ai/api/v1/tts/inference \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key-ID: <x-api-key-id>' \
  --data '
{
  "audio_config": {
    "bitrate": "192k",
    "container": "mp3",
    "encoding": "linear_pcm",
    "num_channels": 1,
    "sample_rate": 44100,
    "sample_width": 2
  },
  "model": "vachana-voice-v2",
  "text": "नमस्ते, आप कैसे हैं?"
}
'
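The same request can be made without the SDK. Below is a minimal sketch using only the Python standard library, assuming the endpoint, headers, and payload shown in the curl example above, and that a successful response body is the raw audio bytes (the URL and key ID are placeholders):

```python
import json
import urllib.request

API_URL = "https://api.vachana.ai/api/v1/tts/inference"

# Payload mirrors the curl example above.
payload = {
    "audio_config": {
        "bitrate": "192k",
        "container": "mp3",
        "encoding": "linear_pcm",
        "num_channels": 1,
        "sample_rate": 44100,
        "sample_width": 2,
    },
    "model": "vachana-voice-v2",
    "text": "नमस्ते, आप कैसे हैं?",
}

def synthesize(key_id: str, url: str = API_URL) -> bytes:
    """POST the payload and return the binary audio body."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key-ID": key_id,
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read()

# Example usage (requires a valid key):
# audio = synthesize("your-api-key-id")
# open("output.mp3", "wb").write(audio)
```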



Overview

Get the complete synthesized audio in a single response. Best suited for downloads or batch processing. For streaming playback, see TTS Streaming or TTS Realtime.
Passing numbers, IDs, dates, or currency values as raw strings can cause mispronunciations. See the Input Formatting Guide for the correct formatting of phone numbers, account numbers, PINs, Aadhaar numbers, vehicle registration numbers, GSTIN, currency, and more.
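The Input Formatting Guide's exact rules are not reproduced here, but the general idea is to pre-expand such values into speakable text before sending them. A minimal, hypothetical sketch for digit strings (the separator and pause hinting are illustrative, not the guide's official rules):

```python
def spell_digits(value: str, sep: str = " ") -> str:
    """Expand a digit string so TTS reads each digit individually,
    turning spaces and hyphens into pause hints at group boundaries."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(ch)
        elif ch in " -":
            out.append(",")  # hint a pause at the group boundary
        else:
            out.append(ch)
    return sep.join(out)

# A 10-digit phone number becomes digit-by-digit text:
# spell_digits("98765 43210") -> "9 8 7 6 5 , 4 3 2 1 0"
```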

Python SDK

The official Python SDK lets you synthesize speech in one line, without constructing JSON payloads or handling binary audio responses manually.

Installation

pip install gnani-vachana
Requires Python 3.9+.

Authentication

The TTS client requires only your API key.
from gnani.tts import GnaniTTSClient

client = GnaniTTSClient(api_key="your-api-key")

Synthesize Speech

The synthesize method returns the complete audio as bytes, which you can write to a file or pass directly to an audio player.
from gnani.tts import GnaniTTSClient

client = GnaniTTSClient(api_key="your-api-key")

audio = client.synthesize(
    "नमस्ते, आप कैसे हैं?",
    voice="sia",
)

with open("output.wav", "wb") as f:
    f.write(audio)

Custom Audio Config

Control the sample rate, encoding, and container format of the output audio.
from gnani.tts import GnaniTTSClient, AudioConfig

client = GnaniTTSClient(api_key="your-api-key")

audio = client.synthesize(
    "यह एक टेस्ट है",
    voice="raju",
    audio_config=AudioConfig(
        sample_rate=44100,
        encoding="linear_pcm",
        container="wav",
    ),
)

with open("output.wav", "wb") as f:
    f.write(audio)

List Available Voices

from gnani.tts import GnaniTTSClient

voices = GnaniTTSClient.supported_voices()
print(voices)

Headers

X-API-Key-ID
string
required

Body

application/json

Request body for TTS inference.

text
string
required
model
enum<string>
required

Supported TTS models.

Available options: vachana-voice-v2
audio_config
AudioConfig · object
required

Audio output configuration.

voice
enum<string>

ID of a pre-defined voice. Ignored if speaker_embedding is provided.

Available options: sia, raju, kanika, nikita, ravan, simran, karan, neha
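Because voice is an enum, an unsupported ID will be rejected server-side. A small client-side check can fail fast instead; this sketch mirrors the voice IDs documented above as a hard-coded set (a hypothetical helper; prefer GnaniTTSClient.supported_voices() at runtime in case the list changes):

```python
# Voice IDs as documented above; the SDK's supported_voices()
# is the authoritative source at runtime.
SUPPORTED_VOICES = {
    "sia", "raju", "kanika", "nikita",
    "ravan", "simran", "karan", "neha",
}

def validate_voice(voice: str) -> str:
    """Return the voice ID if supported, else raise with the valid options."""
    if voice not in SUPPORTED_VOICES:
        raise ValueError(
            f"Unknown voice {voice!r}; expected one of {sorted(SUPPORTED_VOICES)}"
        )
    return voice
```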

Response

Successful audio synthesis

The response is of type file.