Overview

pipecat-gnani is a Pipecat service integration that wraps the Vachana STT and TTS APIs into Pipecat’s standard STTService, TTSService, and InterruptibleTTSService base classes. Drop the services into any Pipecat pipeline and get high-accuracy Indian-language transcription and low-latency synthesis without managing WebSocket connections yourself.

gnani-vachana      ← Core SDK (REST, WebSocket, SSE clients)
pipecat-gnani      ← This package (Pipecat service adapter)
Your Pipecat voice agent

All connection logic, authentication, and audio format handling live in the core SDK. The plugin is purely an adapter layer.

Installation

pip install pipecat-gnani

This also installs gnani-vachana (the core SDK) as a dependency.

Requirements: Python 3.10+

Prerequisites

You need a Gnani API key. To get one, email speechstack@gnani.ai. All new accounts receive free credits, and no credit card is required.

export GNANI_API_KEY="your-api-key"
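
Once the variable is exported, the key can be read at startup instead of being hard-coded in the service constructors. The following is a minimal sketch; load_gnani_api_key is a hypothetical helper written for illustration and is not part of pipecat-gnani or gnani-vachana.

```python
import os


def load_gnani_api_key(env=None):
    """Return the Gnani API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; not part of pipecat-gnani.
    """
    env = os.environ if env is None else env
    key = env.get("GNANI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GNANI_API_KEY is not set; export it before starting the agent."
        )
    return key
```

The returned value can then be passed as api_key= to any of the services described on this page.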

Services

This plugin provides three service classes. Choose based on your use case:

| Service | Type | Transport | Best for |
|---|---|---|---|
| GnaniSTTService | STT | WebSocket | Live conversations, real-time agents |
| GnaniTTSService | TTS | WebSocket streaming | Conversational agents with interruption support |
| GnaniHttpTTSService | TTS | REST | Batch synthesis, non-streaming pipelines |

Quick Start

Speech-to-Text

from pipecat_gnani import GnaniSTTService
from pipecat_gnani.language import Language

stt = GnaniSTTService(
    api_key="your-api-key",
    settings=GnaniSTTService.Settings(
        language=Language.HI_IN,
    ),
)

Text-to-Speech (WebSocket)

from pipecat_gnani import GnaniTTSService

tts = GnaniTTSService(
    api_key="your-api-key",
    settings=GnaniTTSService.Settings(
        voice="sia",
        language="IND-IN",
    ),
)

Text-to-Speech (REST)

from pipecat_gnani import GnaniHttpTTSService

tts = GnaniHttpTTSService(
    api_key="your-api-key",
    aiohttp_session=session,         # pass your aiohttp.ClientSession here
    settings=GnaniHttpTTSService.Settings(
        voice="sia",
        language="hi-IN",
    ),
)

STT — GnaniSTTService

Real-time streaming speech-to-text via WebSocket with built-in Voice Activity Detection.
  • Connects to wss://api.vachana.ai/stt/v3/stream
  • Sends raw PCM audio in 1,024-byte frames
  • Receives transcript events with segment metadata (text, segment_id, segment_index, latency)
  • Supports 8 kHz and 16 kHz sample rates

from pipecat_gnani import GnaniSTTService
from pipecat_gnani.language import Language

stt = GnaniSTTService(
    api_key="your-api-key",
    settings=GnaniSTTService.Settings(
        language=Language.HI_IN,
        sample_rate=16000,            # 8000 or 16000
    ),
)
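
The service handles framing internally, so none of the following is needed to use it. To make the 1,024-byte frame size concrete, this sketch splits a PCM buffer into frames and computes how much audio each frame carries. It assumes 16-bit mono PCM (2 bytes per sample), which this page does not state; split_frames and frame_duration_ms are hypothetical names.

```python
FRAME_BYTES = 1024        # frame size stated in the list above
BYTES_PER_SAMPLE = 2      # assumption: 16-bit mono PCM


def split_frames(pcm: bytes, frame_bytes: int = FRAME_BYTES) -> list[bytes]:
    """Split a PCM buffer into fixed-size frames; the last one may be shorter."""
    return [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]


def frame_duration_ms(sample_rate: int, frame_bytes: int = FRAME_BYTES) -> float:
    """Duration of one full frame in milliseconds at the given sample rate."""
    samples = frame_bytes // BYTES_PER_SAMPLE
    return 1000 * samples / sample_rate


one_second = bytes(16000 * BYTES_PER_SAMPLE)    # 1 s of silence at 16 kHz
frames = split_frames(one_second)
print(len(frames), len(frames[-1]))             # 32 256
print(frame_duration_ms(16000), frame_duration_ms(8000))   # 32.0 64.0
```

Under that assumption, one frame holds 32 ms of audio at 16 kHz and 64 ms at 8 kHz.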

Settings

| Parameter | Type | Default | Description |
|---|---|---|---|
| language | Language | Language.EN_IN | Language enum for transcription. See Supported Languages. |
| sample_rate | int | 16000 | Audio sample rate in Hz. Accepted values: 8000, 16000. |
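
The transcript events mentioned earlier carry text, segment_id, segment_index, and latency. As an illustration of how downstream code might model them, the sketch below parses one event from JSON. The JSON encoding, the field types, and the latency unit are assumptions; only the four field names come from this page. In a Pipecat pipeline you would normally consume the frames emitted by the service rather than parse events yourself.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptSegment:
    """Hypothetical container for one transcript event."""
    text: str
    segment_id: str
    segment_index: int
    latency: float  # unit is not specified on this page


def parse_segment(raw: str) -> TranscriptSegment:
    """Parse a JSON-encoded event; raises KeyError if a field is missing."""
    data = json.loads(raw)
    return TranscriptSegment(
        text=data["text"],
        segment_id=str(data["segment_id"]),
        segment_index=int(data["segment_index"]),
        latency=float(data["latency"]),
    )


# Made-up payload for illustration only, not a captured server message.
sample = '{"text": "namaste", "segment_id": "seg-1", "segment_index": 0, "latency": 120}'
segment = parse_segment(sample)
print(segment.text, segment.segment_index)
```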

TTS — GnaniTTSService (WebSocket)

Streaming text-to-speech via WebSocket. Extends Pipecat’s InterruptibleTTSService, giving your agent built-in interruption (barge-in) support: when the user speaks over the agent, synthesis stops cleanly.
  • Connects to wss://api.vachana.ai/api/v1/tts
  • Streams audio chunks in real time as synthesis progresses
  • Ideal for live conversational agents where latency and barge-in handling matter

from pipecat_gnani import GnaniTTSService

tts = GnaniTTSService(
    api_key="your-api-key",
    settings=GnaniTTSService.Settings(
        voice="sia",
        language="IND-IN",
    ),
)

GnaniTTSService uses "IND-IN" as its language identifier, not the BCP-47 codes used elsewhere. This is a detail of the Vachana WebSocket TTS protocol: the spoken language is driven primarily by the voice parameter.

Settings

| Parameter | Type | Default | Description |
|---|---|---|---|
| voice | string | "sia" | Voice ID. See Available Voices. |
| language | string | "IND-IN" | Language identifier for the WebSocket TTS protocol. |
| sample_rate | int | 16000 | Output sample rate in Hz. |
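
Barge-in is provided by InterruptibleTTSService, so no extra code is required to get it. Purely to illustrate the idea, the sketch below uses plain asyncio: a playback task streams chunks until it is cancelled at the moment the user starts speaking. Nothing here is Pipecat or Vachana API; stream_chunks and run_with_barge_in are hypothetical names, and the sleeps stand in for network and audio timing.

```python
import asyncio


async def stream_chunks(chunks, played):
    """Pretend to play audio chunks one by one, recording what was played."""
    for chunk in chunks:
        await asyncio.sleep(0.01)    # stands in for synthesis and playback time
        played.append(chunk)


async def run_with_barge_in(chunks, interrupt_after):
    """Stream chunks, cancelling playback after `interrupt_after` seconds."""
    played = []
    task = asyncio.create_task(stream_chunks(chunks, played))
    await asyncio.sleep(interrupt_after)   # the user starts speaking here
    task.cancel()                          # stop the remaining synthesis
    try:
        await task
    except asyncio.CancelledError:
        pass                               # expected when playback was cut short
    return played


played = asyncio.run(run_with_barge_in([f"chunk-{i}" for i in range(100)], 0.05))
print(f"played {len(played)} of 100 chunks before the interruption")
```

The exact number of chunks played depends on timing; the point is that the remaining chunks are never produced once the task is cancelled.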

TTS — GnaniHttpTTSService (REST)

REST-based text-to-speech for non-streaming use cases. Returns the complete audio in a single response.
  • Calls POST /api/v1/tts/inference
  • Requires an active aiohttp.ClientSession passed at construction time
  • Suitable for batch synthesis or pipelines where streaming is not needed

import aiohttp
from pipecat_gnani import GnaniHttpTTSService

async def build_pipeline():
    async with aiohttp.ClientSession() as session:
        tts = GnaniHttpTTSService(
            api_key="your-api-key",
            aiohttp_session=session,
            settings=GnaniHttpTTSService.Settings(
                voice="sia",
                language="hi-IN",
            ),
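        )
        # Build and run the pipeline inside this block: the session is
        # closed as soon as the `async with` body exits.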

Settings

| Parameter | Type | Default | Description |
|---|---|---|---|
| voice | string | "sia" | Voice ID. See Available Voices. |
| language | string | "hi-IN" | BCP-47 language code. See Supported Languages. |
| sample_rate | int | 16000 | Output sample rate in Hz. |
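
Because the REST service returns the whole utterance at once, a common follow-up in batch jobs is writing the audio to disk. The sketch below wraps raw PCM in a WAV container using the standard-library wave module. It assumes the audio is 16-bit mono PCM at the configured sample_rate, which this page does not confirm; if the API already returns WAV or another container, this step is unnecessary. pcm_to_wav is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)            # assumption: mono
        wav.setsampwidth(2)            # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


silence = bytes(16000 * 2)             # 1 s of silence stands in for real audio
wav_bytes = pcm_to_wav(silence)

# Read the result back to confirm the header matches what was written.
with wave.open(io.BytesIO(wav_bytes), "rb") as check:
    rate, n_frames = check.getframerate(), check.getnframes()
print(rate, n_frames)                  # 16000 16000
```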

Available Voices

| Voice | ID |
|---|---|
| Sia | sia |
| Raju | raju |
| Kanika | kanika |
| Nikita | nikita |
| Ravan | ravan |
| Simran | simran |
| Karan | karan |
| Neha | neha |

Supported Languages

| Language | Code (Language enum) | BCP-47 string |
|---|---|---|
| Bengali | Language.BN_IN | bn-IN |
| English (India) | Language.EN_IN | en-IN |
| Gujarati | Language.GU_IN | gu-IN |
| Hindi | Language.HI_IN | hi-IN |
| Kannada | Language.KN_IN | kn-IN |
| Malayalam | Language.ML_IN | ml-IN |
| Marathi | Language.MR_IN | mr-IN |
| Punjabi | Language.PA_IN | pa-IN |
| Tamil | Language.TA_IN | ta-IN |
| Telugu | Language.TE_IN | te-IN |

Use the Language enum for GnaniSTTService and the BCP-47 string for GnaniHttpTTSService. GnaniTTSService uses "IND-IN" regardless of language, because the selected voice determines the language implicitly.
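
The three conventions can be summarised in one small lookup. The sketch below is an illustration only: language_argument is a hypothetical helper, not part of pipecat-gnani, and for GnaniSTTService it returns the name of the Language enum member (for example "HI_IN") rather than the enum value, so that the example runs without the package installed.

```python
# BCP-47 codes taken from the Supported Languages table.
SUPPORTED = {
    "bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN",
    "ml-IN", "mr-IN", "pa-IN", "ta-IN", "te-IN",
}


def language_argument(service: str, bcp47: str) -> str:
    """Return the language value a given service expects for a BCP-47 code."""
    if bcp47 not in SUPPORTED:
        raise ValueError(f"Unsupported language: {bcp47}")
    if service == "GnaniSTTService":
        # Name of the enum member; resolve it with getattr(Language, name).
        return bcp47.upper().replace("-", "_")
    if service == "GnaniHttpTTSService":
        return bcp47
    if service == "GnaniTTSService":
        return "IND-IN"    # fixed identifier; the voice selects the language
    raise ValueError(f"Unknown service: {service}")


for name in ("GnaniSTTService", "GnaniHttpTTSService", "GnaniTTSService"):
    print(name, language_argument(name, "hi-IN"))
```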
