POST /stt/v3
Speech to Text (REST)
curl --request POST \
  --url https://api.vachana.ai/stt/v3 \
  --header 'Content-Type: multipart/form-data' \
  --header 'X-API-Key-ID: <api-key>' \
  --form audio_file='@example-file' \
  --form language_code=hi-IN
{
  "success": true,
  "request_id": "req_abc123",
  "timestamp": "20251226_143052.123",
  "transcript": "नमस्ते, आप कैसे हैं?"
}
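
The same request can be issued from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names `build_multipart` and `transcribe` are illustrative, the `application/octet-stream` part type is an assumption (it mirrors what curl sends for a file of unknown type), and error handling is left out.

```python
import json
import os
import uuid
import urllib.request

API_URL = "https://api.vachana.ai/stt/v3"  # endpoint taken from the curl example


def build_multipart(fields, file_field, filename, file_bytes, boundary=None):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(f"--{boundary}\r\n".encode())
        parts.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        parts.append(f"{value}\r\n".encode())
    parts.append(f"--{boundary}\r\n".encode())
    parts.append(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    # Assumption: a generic binary part type is accepted for the audio file.
    parts.append(b"Content-Type: application/octet-stream\r\n\r\n")
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(audio_path, language_code, api_key):
    """POST one audio file and return the decoded JSON response as a dict."""
    with open(audio_path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart(
        {"language_code": language_code},
        "audio_file",
        os.path.basename(audio_path),
        audio,
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "X-API-Key-ID": api_key},
    )
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `transcribe("sample.wav", "hi-IN", "<api-key>")` would then return a dictionary with the fields shown in the JSON response above.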

Language Codes

The Vachana API supports the following 10 languages, each identified by an Indian locale code:

| Language | Code | Native Script | Example Text |
| --- | --- | --- | --- |
| Bengali | bn-IN | Bengali (বাংলা) | “আমি ভাত খাই” |
| English | en-IN | Latin | “I am going to the market” |
| Gujarati | gu-IN | Gujarati (ગુજરાતી) | “હું બજાર જાઉં છું” |
| Hindi | hi-IN | Devanagari (हिन्दी) | “मैं बाज़ार जा रहा हूँ” |
| Kannada | kn-IN | Kannada (ಕನ್ನಡ) | “ನಾನು ಮಾರುಕಟ್ಟೆಗೆ ಹೋಗುತ್ತೇನೆ” |
| Malayalam | ml-IN | Malayalam (മലയാളം) | “ഞാൻ ചന്തയിലേക്ക് പോകുന്നു” |
| Marathi | mr-IN | Devanagari (मराठी) | “मी बाजारात जातोय” |
| Punjabi | pa-IN | Gurmukhi (ਪੰਜਾਬੀ) | “ਮੈਂ ਬਾਜ਼ਾਰ ਜਾ ਰਿਹਾ ਹਾਂ” |
| Tamil | ta-IN | Tamil (தமிழ்) | “நான் சந்தைக்கு செல்கிறேன்” |
| Telugu | te-IN | Telugu (తెలుగు) | “నేను మార్కెట్‌కి వెళ్తున్నాను” |
| Auto-detect | en-IN,hi-IN,ta-IN,te-IN,kn-IN,ml-IN,gu-IN,mr-IN,bn-IN,pa-IN | All supported | Automatically detects language |

Auto-detect Mode: To enable automatic language detection across all supported languages, pass all language codes as a comma-separated list.
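
As a small illustration of auto-detect mode, the sketch below joins every supported code into the comma-separated value expected by `language_code`. The constant and function names are hypothetical. The order follows the Auto-detect entry in the table above; whether the order matters to the API is not stated here.

```python
# Codes in the order used by the Auto-detect entry of the language table.
SUPPORTED_LANGUAGE_CODES = (
    "en-IN", "hi-IN", "ta-IN", "te-IN", "kn-IN",
    "ml-IN", "gu-IN", "mr-IN", "bn-IN", "pa-IN",
)


def auto_detect_language_code():
    """Return the language_code value that lists every supported language."""
    return ",".join(SUPPORTED_LANGUAGE_CODES)


if __name__ == "__main__":
    print(auto_detect_language_code())
```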

Authorizations

X-API-Key-ID
string
header
required

API key for authentication. Contact Gnani.ai to obtain your API key.
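
A minimal sketch of assembling the request headers, assuming the key is read from an environment variable. The variable name `VACHANA_API_KEY` and the helper `build_headers` are hypothetical. The helper also fills in the optional `X-API-Request-ID` header described under Headers; the `req_` prefix only imitates the documented example and is not a required format.

```python
import os
import uuid


def build_headers(api_key, request_id=None):
    """Return the authentication header plus a request ID for tracking."""
    if not api_key:
        raise ValueError("an API key is required")
    if request_id is None:
        # Assumption: any unique string is acceptable as a request ID.
        request_id = f"req_{uuid.uuid4().hex[:12]}"
    return {"X-API-Key-ID": api_key, "X-API-Request-ID": request_id}


if __name__ == "__main__":
    # VACHANA_API_KEY is a hypothetical variable name chosen for this sketch.
    print(build_headers(os.environ.get("VACHANA_API_KEY", "<api-key>")))
```

Reading the key from the environment keeps it out of source control.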

Headers

X-API-Request-ID
string

Optional. Unique request ID used for tracking and logging.

Example:

"req_abc123"

Body

multipart/form-data
audio_file
file
required

Audio file to transcribe. Supported formats: WAV, MP3, OGG, FLAC, AAC, M4A. Maximum duration: 30 seconds.
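
The format and duration limits can be checked on the client before uploading. The sketch below is one possible pre-flight check, not part of the API: it judges the format by file extension only (an assumption, since the server may inspect the content instead) and measures duration only for WAV files, because the standard library has no decoder for the other formats. All names are illustrative.

```python
import os
import wave

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".ogg", ".flac", ".aac", ".m4a"}
MAX_DURATION_SECONDS = 30.0


def wav_duration_seconds(path):
    """Duration of a WAV file, computed as frame count / sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_audio_file(path):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    elif ext == ".wav":
        duration = wav_duration_seconds(path)
        if duration > MAX_DURATION_SECONDS:
            problems.append(f"duration {duration:.1f}s exceeds 30 seconds")
    # Other supported formats pass through unmeasured in this sketch.
    return problems
```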

language_code
string
required

Language code for transcription. Use one of the supported language codes.

Supported values: bn-IN, en-IN, gu-IN, hi-IN, kn-IN, ml-IN, mr-IN, pa-IN, ta-IN, te-IN

For multilingual transcription, use comma-separated values (e.g., en-IN,hi-IN).

Example:

"hi-IN"

preferred_language
enum<string>

Optional. Preferred language to use when language_code lists multiple languages. Must be one of the codes given in language_code. When set, the monolingual model for that language is used.
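
The relationship between the two fields can be sketched as a small client-side check. This assumes the rules stated in this section: every code in `language_code` must be a supported value, and `preferred_language` must be one of the codes listed there. The function name `build_form_fields` is hypothetical, and the server remains the authority on validation.

```python
SUPPORTED_LANGUAGE_CODES = frozenset({
    "bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN",
    "ml-IN", "mr-IN", "pa-IN", "ta-IN", "te-IN",
})


def build_form_fields(language_code, preferred_language=None):
    """Validate the language fields and return them as form fields."""
    codes = [c.strip() for c in language_code.split(",") if c.strip()]
    if not codes:
        raise ValueError("language_code must contain at least one code")
    unknown = [c for c in codes if c not in SUPPORTED_LANGUAGE_CODES]
    if unknown:
        raise ValueError(f"unsupported language code(s): {', '.join(unknown)}")
    fields = {"language_code": ",".join(codes)}
    if preferred_language is not None:
        if preferred_language not in codes:
            raise ValueError("preferred_language must appear in language_code")
        fields["preferred_language"] = preferred_language
    return fields


if __name__ == "__main__":
    print(build_form_fields("en-IN,hi-IN", preferred_language="hi-IN"))
```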

Available options: bn-IN, en-IN, gu-IN, hi-IN, kn-IN, ml-IN, mr-IN, pa-IN, ta-IN, te-IN

Example:

"hi-IN"

Response

Successful transcription

success
boolean

Indicates whether the transcription succeeded

request_id
string

Unique identifier for this request

timestamp
string

Request timestamp in the format YYYYMMDD_HHMMSS.mmm (for example, 20251226_143052.123)

transcript
string

The transcribed text from the audio
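
The response fields above map naturally onto a small typed structure. The sketch below parses the JSON body and converts `timestamp` from the YYYYMMDD_HHMMSS.mmm format into a `datetime`. The `Transcription` class and `parse_response` function are illustrative names. The time zone of the timestamp is not specified in this reference, so the parsed value is left naive.

```python
import json
from dataclasses import dataclass
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S.%f"  # YYYYMMDD_HHMMSS.mmm


@dataclass
class Transcription:
    success: bool
    request_id: str
    timestamp: datetime  # naive: the time zone is not documented
    transcript: str


def parse_response(body):
    """Parse a successful response body (str or bytes) into a Transcription."""
    data = json.loads(body)
    return Transcription(
        success=bool(data["success"]),
        request_id=data["request_id"],
        timestamp=datetime.strptime(data["timestamp"], TIMESTAMP_FORMAT),
        transcript=data["transcript"],
    )


if __name__ == "__main__":
    sample = (
        '{"success": true, "request_id": "req_abc123", '
        '"timestamp": "20251226_143052.123", '
        '"transcript": "नमस्ते, आप कैसे हैं?"}'
    )
    print(parse_response(sample))
```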