Overview
pipecat-gnani is a Pipecat service integration that wraps the Vachana STT and TTS APIs into Pipecat’s standard STTService, TTSService, and InterruptibleTTSService base classes. Drop the services into any Pipecat pipeline and get high-accuracy Indian-language transcription and low-latency synthesis without managing WebSocket connections yourself.
gnani-vachana ← Core SDK (REST, WebSocket, SSE clients)
↑
pipecat-gnani ← This package (Pipecat service adapter)
↑
Your Pipecat voice agent
All connection logic, authentication, and audio format handling live in the core SDK. The plugin is purely an adapter layer.
Installation
pip install pipecat-gnani
This also installs gnani-vachana (the core SDK) as a dependency.
Requirements: Python 3.10+
Prerequisites
You need a Gnani API key. Email speechstack@gnani.ai to get started — all new accounts receive free credits, no credit card required.
export GNANI_API_KEY="your-api-key"
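Once the variable is exported, the key can be read at startup instead of being hard-coded. The following is a minimal sketch using only the standard library; `load_gnani_api_key` is a hypothetical helper name for illustration, not part of pipecat-gnani.

```python
import os


def load_gnani_api_key(env_var: str = "GNANI_API_KEY") -> str:
    """Return the API key from the environment, failing fast if it is missing.

    Hypothetical helper: the name and behavior are illustrative only.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the agent")
    return key
```

The returned string can then be passed as `api_key=` to any of the service constructors shown in this document.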
Services
This plugin provides three service classes. Choose based on your use case:
| Service | Type | Transport | Best for |
|---|---|---|---|
| GnaniSTTService | STT | WebSocket | Live conversations, real-time agents |
| GnaniTTSService | TTS | WebSocket streaming | Conversational agents with interruption support |
| GnaniHttpTTSService | TTS | REST | Batch synthesis, non-streaming pipelines |
Quick Start
Speech-to-Text
from pipecat_gnani import GnaniSTTService
from pipecat_gnani.language import Language

stt = GnaniSTTService(
    api_key="your-api-key",
    settings=GnaniSTTService.Settings(
        language=Language.HI_IN,
    ),
)
Text-to-Speech (WebSocket streaming — recommended)
from pipecat_gnani import GnaniTTSService

tts = GnaniTTSService(
    api_key="your-api-key",
    settings=GnaniTTSService.Settings(
        voice="sia",
        language="IND-IN",
    ),
)
Text-to-Speech (REST)
from pipecat_gnani import GnaniHttpTTSService

tts = GnaniHttpTTSService(
    api_key="your-api-key",
    aiohttp_session=session,  # pass your aiohttp.ClientSession here
    settings=GnaniHttpTTSService.Settings(
        voice="sia",
        language="hi-IN",
    ),
)
STT — GnaniSTTService
Real-time streaming speech-to-text via WebSocket with built-in Voice Activity Detection.
- Connects to wss://api.vachana.ai/stt/v3/stream
- Sends raw PCM audio in 1,024-byte frames
- Receives transcript events with segment metadata (text, segment_id, segment_index, latency)
- Supports 8 kHz and 16 kHz sample rates
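To get a feel for what a 1,024-byte frame means in practice, the sketch below splits a PCM buffer into frames and computes how much audio one frame carries. It assumes 16-bit mono PCM, which the text above does not state; `iter_frames` and `frame_duration_ms` are hypothetical helper names. The service performs its own framing, so this is for illustration only.

```python
from typing import Iterator

FRAME_BYTES = 1024      # frame size stated in the list above
BYTES_PER_SAMPLE = 2    # ASSUMPTION: 16-bit mono PCM (not confirmed by the source)


def iter_frames(pcm: bytes, frame_bytes: int = FRAME_BYTES) -> Iterator[bytes]:
    """Yield consecutive frames; the final frame may be shorter than frame_bytes."""
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


def frame_duration_ms(sample_rate: int, frame_bytes: int = FRAME_BYTES) -> float:
    """Duration of one full frame in milliseconds, under the 16-bit mono assumption."""
    if sample_rate not in (8000, 16000):
        raise ValueError("sample_rate must be 8000 or 16000")
    return frame_bytes * 1000 / (BYTES_PER_SAMPLE * sample_rate)


# One second of silence at 16 kHz is 32,000 bytes under the assumption above.
one_second = bytes(16000 * BYTES_PER_SAMPLE)
frames = list(iter_frames(one_second))  # 32 frames; the last holds the 256 leftover bytes
```

Under that assumption, each full frame carries 32 ms of audio at 16 kHz and 64 ms at 8 kHz.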
from pipecat_gnani import GnaniSTTService
from pipecat_gnani.language import Language

stt = GnaniSTTService(
    api_key="your-api-key",
    settings=GnaniSTTService.Settings(
        language=Language.HI_IN,
        sample_rate=16000,  # 8000 or 16000
    ),
)
Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| language | Language | Language.EN_IN | Language enum for transcription. See Supported Languages. |
| sample_rate | int | 16000 | Audio sample rate in Hz. Accepted values: 8000, 16000. |
TTS — GnaniTTSService (WebSocket, recommended)
Streaming text-to-speech via WebSocket. Extends Pipecat’s InterruptibleTTSService, giving your agent built-in interruption (barge-in) support — when the user speaks over the agent, synthesis stops cleanly.
- Connects to wss://api.vachana.ai/api/v1/tts
- Streams audio chunks in real-time as synthesis progresses
- Ideal for live conversational agents where latency and barge-in handling matter
from pipecat_gnani import GnaniTTSService

tts = GnaniTTSService(
    api_key="your-api-key",
    settings=GnaniTTSService.Settings(
        voice="sia",
        language="IND-IN",
    ),
)
GnaniTTSService uses "IND-IN" as its language identifier, not the BCP-47 codes used elsewhere. This is a Vachana WebSocket TTS protocol detail — the language selection is driven primarily by the voice parameter.
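The barge-in behavior described for GnaniTTSService can be pictured as cancelling an in-flight streaming task. The sketch below is a toy model built on plain asyncio, not the implementation used by Pipecat or by this plugin; `synthesize` and `demo_barge_in` are hypothetical names, and the "audio chunks" are just strings.

```python
import asyncio


async def synthesize(chunks: list[str], played: list[str],
                     first_chunk_played: asyncio.Event) -> None:
    """Stand-in for streaming synthesis: 'plays' one chunk per loop iteration."""
    for chunk in chunks:
        await asyncio.sleep(0.01)   # pretend to wait for the next audio chunk
        played.append(chunk)
        first_chunk_played.set()    # signal that audio has started playing


async def demo_barge_in() -> list[str]:
    played: list[str] = []
    first_chunk_played = asyncio.Event()
    task = asyncio.create_task(
        synthesize(["a", "b", "c", "d", "e"], played, first_chunk_played)
    )
    await first_chunk_played.wait()  # the user starts speaking mid-utterance
    task.cancel()                    # interruption: stop synthesis right away
    try:
        await task
    except asyncio.CancelledError:
        pass                         # expected: the task was cancelled on purpose
    return played


played = asyncio.run(demo_barge_in())  # only the chunks emitted before the interruption
```

In a real pipeline the interruption is signalled by the framework rather than by an explicit `cancel()` call in application code; the point here is only that the remaining chunks are never played.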
Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| voice | string | "sia" | Voice ID. See Available Voices. |
| language | string | "IND-IN" | Language identifier for the WebSocket TTS protocol. |
| sample_rate | int | 16000 | Output sample rate in Hz. |
TTS — GnaniHttpTTSService (REST)
REST-based text-to-speech for non-streaming use cases. Returns the complete audio in a single response.
- Calls POST /api/v1/tts/inference
- Requires an active aiohttp.ClientSession passed at construction time
- Suitable for batch synthesis or pipelines where streaming is not needed
import aiohttp
from pipecat_gnani import GnaniHttpTTSService

async def build_pipeline():
    async with aiohttp.ClientSession() as session:
        tts = GnaniHttpTTSService(
            api_key="your-api-key",
            aiohttp_session=session,
            settings=GnaniHttpTTSService.Settings(
                voice="sia",
                language="hi-IN",
            ),
        )
        # Build and run the rest of the pipeline here, while the session is still open.
Settings
| Parameter | Type | Default | Description |
|---|---|---|---|
| voice | string | "sia" | Voice ID. See Available Voices. |
| language | string | "hi-IN" | BCP-47 language code. See Supported Languages. |
| sample_rate | int | 16000 | Output sample rate in Hz. |
Available Voices
| Voice | ID |
|---|---|
| Sia | sia |
| Raju | raju |
| Kanika | kanika |
| Nikita | nikita |
| Ravan | ravan |
| Simran | simran |
| Karan | karan |
| Neha | neha |
Supported Languages
| Language | Code (Language enum) | BCP-47 string |
|---|---|---|
| Bengali | Language.BN_IN | bn-IN |
| English (India) | Language.EN_IN | en-IN |
| Gujarati | Language.GU_IN | gu-IN |
| Hindi | Language.HI_IN | hi-IN |
| Kannada | Language.KN_IN | kn-IN |
| Malayalam | Language.ML_IN | ml-IN |
| Marathi | Language.MR_IN | mr-IN |
| Punjabi | Language.PA_IN | pa-IN |
| Tamil | Language.TA_IN | ta-IN |
| Telugu | Language.TE_IN | te-IN |
Use the Language enum for GnaniSTTService. Use the BCP-47 string for GnaniHttpTTSService. GnaniTTSService uses "IND-IN" regardless of language — voice selection handles language implicitly.
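Because each service expects the language in a different form, it can help to derive all three from a single BCP-47 string. The sketch below does this without importing the plugin; `language_arguments` is a hypothetical helper, and the enum member names are inferred from the table above (uppercase, with the hyphen replaced by an underscore).

```python
# BCP-47 strings taken from the Supported Languages table.
SUPPORTED_BCP47 = {
    "bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN",
    "ml-IN", "mr-IN", "pa-IN", "ta-IN", "te-IN",
}
WS_TTS_LANGUAGE = "IND-IN"  # fixed identifier used by GnaniTTSService


def language_arguments(bcp47: str) -> dict[str, str]:
    """Map one BCP-47 code to the form each service expects (hypothetical helper)."""
    if bcp47 not in SUPPORTED_BCP47:
        raise ValueError(f"unsupported language code: {bcp47!r}")
    return {
        "stt_enum_member": bcp47.upper().replace("-", "_"),  # e.g. Language.HI_IN
        "http_tts": bcp47,                                   # GnaniHttpTTSService
        "ws_tts": WS_TTS_LANGUAGE,                           # GnaniTTSService
    }


args = language_arguments("hi-IN")
```

Assuming Language is a standard Python enum, `getattr(Language, args["stt_enum_member"])` would resolve the member for GnaniSTTService.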
Further Reading