Introduction - Inya Docs

Available APIs

Speech-to-Text

API	Description
STT REST	Transcribe short audio files (≤ 60s) via a single HTTP request
STT Realtime	Stream live audio over a WebSocket connection and receive transcript segments in real time
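
As a rough illustration of how the two rows above divide the work, the sketch below picks an STT API from the clip length. The 60-second limit comes from the table. The function name `choose_stt_api` is hypothetical, and routing longer recordings to STT Realtime is an assumption, since this page only states the limit for STT REST.

```python
from typing import Optional

# Limit for STT REST as stated in the table above (audio of 60 s or less).
STT_REST_MAX_SECONDS = 60.0


def choose_stt_api(duration_seconds: Optional[float]) -> str:
    """Suggest an STT API name for a clip of the given length.

    Pass None for a live source whose length is not known in advance.
    Hypothetical helper for illustration; it is not part of any Inya SDK.
    """
    if duration_seconds is None:
        # Live audio has no fixed length, so it cannot be sent as one request.
        return "STT Realtime"
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    if duration_seconds <= STT_REST_MAX_SECONDS:
        return "STT REST"
    # Assumption: clips over the REST limit are streamed instead.
    return "STT Realtime"


if __name__ == "__main__":
    for seconds in (12.5, 60, 95, None):
        print(seconds, "->", choose_stt_api(seconds))
```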

Text-to-Speech

API	Description
TTS REST	Synthesize text to audio in a single synchronous HTTP call
TTS Streaming	Submit text via an HTTP request and receive synthesized audio progressively as a server-sent event stream
TTS Realtime	Stream text incrementally and receive audio concurrently over a persistent WebSocket connection for low-latency playback
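
One way to read the three rows above is as a decision over two questions: is the full text available before the request, and should audio arrive progressively? The sketch below encodes that reading. The function name and both parameters are hypothetical, and the mapping is an interpretation of the table descriptions, not an official recommendation.

```python
def choose_tts_api(text_known_upfront: bool, progressive_audio: bool) -> str:
    """Suggest a TTS API name based on how text arrives and how audio is consumed.

    Hypothetical helper for illustration; it is not part of any Inya SDK.
    """
    if not text_known_upfront:
        # Text is produced incrementally (for example, by a language model),
        # which matches the persistent WebSocket description.
        return "TTS Realtime"
    if progressive_audio:
        # Whole text in one HTTP request, audio delivered as server-sent events.
        return "TTS Streaming"
    # Whole text in, whole audio out, in a single synchronous call.
    return "TTS REST"


if __name__ == "__main__":
    print(choose_tts_api(text_known_upfront=True, progressive_audio=False))
    print(choose_tts_api(text_known_upfront=True, progressive_audio=True))
    print(choose_tts_api(text_known_upfront=False, progressive_audio=True))
```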

Voice Cloning

API	Description
VC Embeddings	Upload a reference audio file to generate a `speaker_embedding` for use in voice cloning
Voice Cloned TTS REST	Synthesize audio in your cloned voice via a single synchronous HTTP call
Voice Cloned TTS Streaming	Stream cloned voice audio progressively using Server-Sent Events
Voice Cloned TTS Realtime	Stream text and receive cloned voice audio in real time over a WebSocket connection

Key Capabilities

Feature	Detail
10+ Indian Languages	Native-script transcription and synthesis in more than 10 Indian languages
Language Detection	Automatic by default, or set `language_code` to target a specific language
Code-Switching	Handles code-mixed speech, where speakers switch languages within an utterance
Audio Flexibility	STT accepts WAV, MP3, OGG, FLAC, AAC, M4A
Voice Cloning	Clone any voice from a short audio sample using speaker embeddings
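
A client can check two of the capabilities above before sending any audio: whether the file type is one STT accepts, and whether to pin a language or rely on automatic detection. The following is a minimal sketch of those checks. The helper names are hypothetical, the extension check assumes the file suffix reflects the real format, and `"hi-IN"` is an assumed example value, since this page does not list valid `language_code` values.

```python
from pathlib import Path
from typing import Dict, Optional

# Formats listed under "Audio Flexibility" above, as lowercase file extensions.
ACCEPTED_STT_FORMATS = {"wav", "mp3", "ogg", "flac", "aac", "m4a"}


def is_accepted_stt_file(filename: str) -> bool:
    """Return True if the file extension matches a format STT accepts."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in ACCEPTED_STT_FORMATS


def build_stt_options(language_code: Optional[str] = None) -> Dict[str, str]:
    """Build request options; leave `language_code` out to use automatic detection."""
    if language_code is None:
        return {}
    return {"language_code": language_code}


if __name__ == "__main__":
    print(is_accepted_stt_file("call_recording.WAV"))  # accepted format
    print(is_accepted_stt_file("notes.txt"))           # not an audio format
    print(build_stt_options())                         # automatic detection
    print(build_stt_options("hi-IN"))                  # assumed example code
```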

Get Started

Ready to begin? Head over to the Quick Start Guide to make your first API call.