Speech-to-Text (REST)

Overview

The REST endpoint transcribes audio files of up to 60 seconds (30 seconds or less is ideal) and returns the transcript in a single synchronous response. It suits batch processing and pre-recorded audio. For real-time transcription, see STT Realtime.
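
Because of the 60-second limit, it can help to check a file's length before uploading it. The following is a minimal sketch using Python's standard `wave` module. It only handles uncompressed WAV files, and the helper names `wav_duration_seconds` and `check_duration` are illustrative, not part of the Vachana API or SDK.

```python
import wave

MAX_SECONDS = 60.0    # documented maximum duration
IDEAL_SECONDS = 30.0  # documented ideal duration


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_duration(path: str) -> str:
    """Classify a WAV file as 'ok', 'long' (over the ideal), or 'too_long'."""
    seconds = wav_duration_seconds(path)
    if seconds > MAX_SECONDS:
        return "too_long"
    if seconds > IDEAL_SECONDS:
        return "long"
    return "ok"
```

Files classified as `too_long` would need to be split or trimmed before being sent to the endpoint.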

Language Codes

The Vachana API supports these 10 Indian languages, plus experimental Hinglish and auto-detect modes:

Language	Code	Native Script	Example Text
Bengali	`bn-IN`	Bengali (বাংলা)	“আমি ভাত খাই”
English	`en-IN`	Latin	“I am going to the market”
Gujarati	`gu-IN`	Gujarati (ગુજરાતી)	“હું બજાર જાઉં છું”
Hindi	`hi-IN`	Devanagari (हिन्दी)	“मैं बाज़ार जा रहा हूँ”
Kannada	`kn-IN`	Kannada (ಕನ್ನಡ)	“ನಾನು ಮಾರುಕಟ್ಟೆಗೆ ಹೋಗುತ್ತೇನೆ”
Malayalam	`ml-IN`	Malayalam (മലയാളം)	“ഞാൻ ചന്തയിലേക്ക് പോകുന്നു”
Marathi	`mr-IN`	Devanagari (मराठी)	“मी बाजारात जातोय”
Punjabi	`pa-IN`	Gurmukhi (ਪੰਜਾਬੀ)	“ਮੈਂ ਬਾਜ਼ਾਰ ਜਾ ਰਿਹਾ ਹਾਂ”
Tamil	`ta-IN`	Tamil (தமிழ்)	“நான் சந்தைக்கு செல்கிறேன்”
Telugu	`te-IN`	Telugu (తెలుగు)	“నేను మార్కెట్‌కి వెళ్తున్నాను”
Hinglish (Latin) (experimental)	`en-hi-IN-latn`	Latin	“Main market ja raha hu”
Hinglish (experimental)	`en-hi-in-cm`	Latin + Devanagari (हिन्दी)	“मैं market जा रहा हूँ”
Auto-detect (experimental)	`en-IN`,`hi-IN`,`ta-IN`,`te-IN`,`kn-IN`,`ml-IN`,`gu-IN`,`mr-IN`,`bn-IN`,`pa-IN`	All supported	Automatically detects language
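
The codes in the table can be kept as constants so that typos are caught before a request is sent. This is a rough sketch: the constant and function names are hypothetical, and it assumes the auto-detect value is exactly the comma-separated string shown in the table. Whether subsets or a different ordering are accepted is not stated here.

```python
# Single-language codes, in the order the auto-detect row lists them.
SINGLE_LANGUAGE_CODES = (
    "en-IN", "hi-IN", "ta-IN", "te-IN", "kn-IN",
    "ml-IN", "gu-IN", "mr-IN", "bn-IN", "pa-IN",
)

# Experimental Hinglish codes, copied verbatim from the table.
EXPERIMENTAL_CODES = ("en-hi-IN-latn", "en-hi-in-cm")

# Assumption: auto-detect is requested by sending all ten codes, comma-separated.
AUTO_DETECT = ",".join(SINGLE_LANGUAGE_CODES)


def validate_language_code(code: str) -> str:
    """Return the code unchanged if the table lists it, else raise ValueError."""
    if code in SINGLE_LANGUAGE_CODES or code in EXPERIMENTAL_CODES or code == AUTO_DETECT:
        return code
    raise ValueError(f"Unsupported language code: {code!r}")
```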

Python SDK

The official Python SDK lets you transcribe audio with a few lines of code, without manually constructing multipart requests or handling HTTP headers.

Installation

pip install gnani-vachana

Requires Python 3.9+.

Authentication

The REST client requires three credentials: your organization_id, api_key, and user_id. You can pass them directly or load them from environment variables.

from gnani.stt import GnaniSTTClient

client = GnaniSTTClient(
    organization_id="your-organization-id",
    api_key="your-api-key",
    user_id="your-user-id",
)
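
To keep credentials out of source code, one option is to read them from environment variables yourself and pass them to the constructor shown above. The sketch below assumes variable names `GNANI_ORGANIZATION_ID`, `GNANI_API_KEY`, and `GNANI_USER_ID`. These names are hypothetical, as is the helper `load_credentials`, and the SDK may offer its own mechanism.

```python
import os


def load_credentials(env=os.environ) -> dict:
    """Read the three credentials from environment variables (names are assumed)."""
    names = {
        "organization_id": "GNANI_ORGANIZATION_ID",
        "api_key": "GNANI_API_KEY",
        "user_id": "GNANI_USER_ID",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {field: env[var] for field, var in names.items()}


# Usage (requires the SDK to be installed):
# from gnani.stt import GnaniSTTClient
# client = GnaniSTTClient(**load_credentials())
```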

Transcribe Audio

result = client.transcribe("recording.wav", language_code="hi-IN")
print(result["transcript"])

Custom Request ID

Pass a request_id to correlate SDK calls with your own logs or support tickets.

result = client.transcribe(
    "call.flac",
    language_code="hi-IN",
    request_id="my-trace-123",
)

Error Handling

from gnani.stt import (
    AuthenticationError,
    InvalidAudioError,
    APIError,
)

try:
    result = client.transcribe("audio.wav", language_code="hi-IN")
    print(result["transcript"])
except AuthenticationError:
    print("Invalid credentials — check your organization_id, api_key, and user_id.")
except InvalidAudioError as e:
    print(f"Bad audio file: {e}")
except APIError as e:
    print(f"API error {e.status_code}: {e}")
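
For batch processing, the same pattern can be applied per file so that one failure does not stop the run. The helper below, `transcribe_batch`, is a hypothetical sketch and not part of the SDK. It accepts any callable with the same shape as `client.transcribe` and catches a generic `Exception`. In real code you would likely catch the SDK exceptions shown above instead.

```python
from typing import Callable, Dict, Iterable, Tuple


def transcribe_batch(
    transcribe: Callable[..., dict],
    paths: Iterable[str],
    language_code: str,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Transcribe each file, collecting transcripts and per-file error messages."""
    transcripts: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    for path in paths:
        try:
            result = transcribe(path, language_code=language_code)
            transcripts[path] = result["transcript"]
        except Exception as exc:  # narrow this to the SDK's exceptions in practice
            errors[path] = str(exc)
    return transcripts, errors


# Usage with the SDK client from the Authentication section:
# transcripts, errors = transcribe_batch(client.transcribe, ["a.wav", "b.wav"], "hi-IN")
```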

Authorizations

X-API-Key-ID (string, header, required)

API key for authentication. Sign up for Vachana to get an API key.

Body

Content type: multipart/form-data

audio_file (file, required)

Audio file to transcribe. Supported formats: WAV, MP3, OGG, FLAC, AAC, M4A. Maximum duration: 60 seconds (30 seconds or less is ideal).

language_code (enum<string>, required)

Language code for transcription. Use one of the supported language codes.

Supported values: bn-IN, en-IN, gu-IN, hi-IN, kn-IN, ml-IN, mr-IN, pa-IN, ta-IN, te-IN, en-hi-IN-latn

Example: "hi-IN"

preferred_language (enum<string>, optional)

Preferred language to use when multiple languages are specified. It must be one of the languages in language_code. When set, it forces processing with the single-language model for that language, which may improve accuracy for audio that is predominantly in one language.

Available options: bn-IN, en-IN, gu-IN, hi-IN, kn-IN, ml-IN, mr-IN, pa-IN, ta-IN, te-IN, en-hi-IN-latn

Example: "hi-IN"

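If you build the request yourself instead of using the SDK, the header and form fields described above could be assembled roughly as follows. This is a sketch under assumptions: the helper `build_request_parts` is hypothetical, and multiple languages are assumed to be sent as one comma-separated language_code string, as the auto-detect row of the language table suggests. The endpoint URL is not given in this section, so the sketch stops short of sending anything.

```python
from typing import Dict, Optional, Tuple


def build_request_parts(
    api_key: str,
    language_code: str,
    preferred_language: Optional[str] = None,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, form_fields) for a transcription request."""
    headers = {"X-API-Key-ID": api_key}
    fields = {"language_code": language_code}
    if preferred_language is not None:
        # Documented rule: preferred_language must be one of the requested languages.
        requested = [code.strip() for code in language_code.split(",")]
        if preferred_language not in requested:
            raise ValueError(
                f"preferred_language {preferred_language!r} is not in language_code"
            )
        fields["preferred_language"] = preferred_language
    return headers, fields
```

The audio itself would travel as the audio_file part of the same multipart/form-data body, alongside these fields.
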
Response

Successful transcription

success (boolean)

Indicates whether the transcription was successful.

timestamp (string)

Request timestamp in the format YYYYMMDD_HHMMSS.mmm.

transcript (string)

The transcribed text from the audio.
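
A response with these three fields can be unpacked with the standard library. The sketch below is illustrative: `parse_response` is a hypothetical helper, any payload passed to it in examples is made up, and the time zone of the timestamp is not stated here, so the parsed value is left naive.

```python
from datetime import datetime
from typing import Tuple

TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S.%f"  # matches YYYYMMDD_HHMMSS.mmm


def parse_response(payload: dict) -> Tuple[str, datetime]:
    """Return (transcript, request_time) from a decoded JSON response."""
    if not payload.get("success"):
        raise ValueError("transcription was not successful")
    request_time = datetime.strptime(payload["timestamp"], TIMESTAMP_FORMAT)
    return payload["transcript"], request_time
```

The `%f` directive accepts one to six fractional digits, so the three-digit millisecond field parses without extra handling.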
