Pipecat Plugin - Inya Docs

Overview

pipecat-gnani is a Pipecat service integration that wraps the Vachana STT and TTS APIs into Pipecat’s standard STTService, TTSService, and InterruptibleTTSService base classes. Drop the services into any Pipecat pipeline and get high-accuracy Indian-language transcription and low-latency synthesis without managing WebSocket connections yourself.

gnani-vachana      ← Core SDK (REST, WebSocket, SSE clients)
    ↑
pipecat-gnani      ← This package (Pipecat service adapter)
    ↑
Your Pipecat voice agent

All connection logic, authentication, and audio format handling live in the core SDK. The plugin is purely an adapter layer.

Installation

pip install pipecat-gnani

This also installs gnani-vachana (the core SDK) as a dependency. Requires Python 3.10 or later.

Prerequisites

You need a Gnani API key. Email speechstack@gnani.ai to get started. All new accounts receive free credits, and no credit card is required.

export GNANI_API_KEY="your-api-key"
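The examples in this guide pass the key as a literal string for brevity. In practice you will probably want to read it from the environment variable exported above. The helper below is a minimal sketch of that; the function name `load_gnani_api_key` is hypothetical and not part of the plugin.

```python
import os


def load_gnani_api_key(env=None) -> str:
    """Return the Gnani API key from the environment, or fail loudly.

    `env` defaults to os.environ; pass a plain dict to make testing easier.
    This helper is illustrative and is not provided by pipecat-gnani.
    """
    env = os.environ if env is None else env
    key = env.get("GNANI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GNANI_API_KEY is not set; export it before starting the agent"
        )
    return key
```

The returned value can then be passed as `api_key=load_gnani_api_key()` to any of the service constructors shown below.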

Services

This plugin provides three service classes. Choose based on your use case:

Service	Type	Transport	Best for
`GnaniSTTService`	STT	WebSocket	Live conversations, real-time agents
`GnaniTTSService`	TTS	WebSocket streaming	Conversational agents with interruption support
`GnaniHttpTTSService`	TTS	REST	Batch synthesis, non-streaming pipelines

Quick Start

Speech-to-Text

from pipecat_gnani import GnaniSTTService
from pipecat_gnani.language import Language

stt = GnaniSTTService(
    api_key="your-api-key",
    settings=GnaniSTTService.Settings(
        language=Language.HI_IN,
    ),
)

Text-to-Speech (WebSocket streaming — recommended)

from pipecat_gnani import GnaniTTSService

tts = GnaniTTSService(
    api_key="your-api-key",
    settings=GnaniTTSService.Settings(
        voice="sia",
        language="IND-IN",
    ),
)

Text-to-Speech (REST)

from pipecat_gnani import GnaniHttpTTSService

tts = GnaniHttpTTSService(
    api_key="your-api-key",
    aiohttp_session=session,         # an open aiohttp.ClientSession that you create and close yourself
    settings=GnaniHttpTTSService.Settings(
        voice="sia",
        language="hi-IN",
    ),
)

STT — `GnaniSTTService`

Real-time streaming speech-to-text via WebSocket with built-in Voice Activity Detection.

Connects to wss://api.vachana.ai/stt/v3/stream
Sends raw PCM audio in 1,024-byte frames
Receives transcript events with segment metadata (text, segment_id, segment_index, latency)
Supports 8 kHz and 16 kHz sample rates
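To make the framing concrete, the sketch below splits a PCM buffer into 1,024-byte frames and estimates how much audio each frame carries. It is not the plugin's code, which handles framing for you. It assumes 16-bit mono PCM, which this page does not state, and the function names are hypothetical.

```python
FRAME_BYTES = 1024  # frame size stated in the list above


def split_into_frames(pcm: bytes, frame_bytes: int = FRAME_BYTES) -> list[bytes]:
    """Split a raw PCM buffer into fixed-size frames.

    The final frame may be shorter than frame_bytes; whether the service
    expects padding is not documented here, so none is added.
    """
    return [pcm[i:i + frame_bytes] for i in range(0, len(pcm), frame_bytes)]


def frame_duration_ms(
    sample_rate: int,
    frame_bytes: int = FRAME_BYTES,
    bytes_per_sample: int = 2,  # ASSUMPTION: 16-bit samples
    channels: int = 1,          # ASSUMPTION: mono audio
) -> float:
    """Return the audio duration, in milliseconds, carried by one frame."""
    if sample_rate not in (8000, 16000):
        raise ValueError("sample_rate must be 8000 or 16000")
    samples_per_frame = frame_bytes / (bytes_per_sample * channels)
    return 1000.0 * samples_per_frame / sample_rate
```

Under those assumptions, one frame holds 512 samples, which is 32 ms of audio at 16 kHz and 64 ms at 8 kHz.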

from pipecat_gnani import GnaniSTTService
from pipecat_gnani.language import Language

stt = GnaniSTTService(
    api_key="your-api-key",
    settings=GnaniSTTService.Settings(
        language=Language.HI_IN,
        sample_rate=16000,            # 8000 or 16000
    ),
)

Settings

Parameter	Type	Default	Description
`language`	`Language`	`Language.EN_IN`	Language enum for transcription. See Supported Languages.
`sample_rate`	`int`	`16000`	Audio sample rate in Hz. Accepted values: `8000`, `16000`.

TTS — `GnaniTTSService` (WebSocket, recommended)

Streaming text-to-speech via WebSocket. Extends Pipecat’s InterruptibleTTSService, giving your agent built-in interruption (barge-in) support — when the user speaks over the agent, synthesis stops cleanly.

Connects to wss://api.vachana.ai/api/v1/tts
Streams audio chunks in real-time as synthesis progresses
Ideal for live conversational agents where latency and barge-in handling matter
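The barge-in behaviour described above can be pictured with plain asyncio. The sketch below is a conceptual illustration only: it is not how Pipecat or this plugin implements interruption, and every name in it is hypothetical. It races a chunk-playback task against an interruption event and cancels playback as soon as the event fires.

```python
import asyncio


async def stream_chunks(chunks, played, delay):
    """Pretend to play audio chunks one by one, recording each played chunk."""
    for chunk in chunks:
        await asyncio.sleep(delay)  # stand-in for writing a chunk to the output
        played.append(chunk)


async def speak_with_barge_in(chunks, interrupt: asyncio.Event, delay: float = 0.01):
    """Play chunks until they run out or `interrupt` is set, whichever is first.

    Returns the list of chunks that were played before stopping.
    """
    played = []
    playback = asyncio.create_task(stream_chunks(chunks, played, delay))
    barge_in = asyncio.create_task(interrupt.wait())

    # Wait for whichever finishes first: normal completion or an interruption.
    _, pending = await asyncio.wait(
        {playback, barge_in}, return_when=asyncio.FIRST_COMPLETED
    )

    # Cancel the loser and wait for the cancellation to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return played
```

With `GnaniTTSService` you do not write this logic yourself; the sketch only shows the kind of cancellation that interruption support implies.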

from pipecat_gnani import GnaniTTSService

tts = GnaniTTSService(
    api_key="your-api-key",
    settings=GnaniTTSService.Settings(
        voice="sia",
        language="IND-IN",
    ),
)

GnaniTTSService takes "IND-IN" as its language identifier rather than the BCP-47 codes used elsewhere. This is a detail of the Vachana WebSocket TTS protocol; language selection is driven primarily by the voice parameter.

Settings

Parameter	Type	Default	Description
`voice`	`string`	`"sia"`	Voice ID. See Available Voices.
`language`	`string`	`"IND-IN"`	Language identifier for the WebSocket TTS protocol.
`sample_rate`	`int`	`16000`	Output sample rate in Hz.

TTS — `GnaniHttpTTSService` (REST)

REST-based text-to-speech for non-streaming use cases. Returns the complete audio in a single response.

Calls POST /api/v1/tts/inference
Requires an active aiohttp.ClientSession passed at construction time
Suitable for batch synthesis or pipelines where streaming is not needed

import aiohttp
from pipecat_gnani import GnaniHttpTTSService

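async def build_pipeline():
    # The session must stay open for as long as the TTS service is in use.
    async with aiohttp.ClientSession() as session:
        tts = GnaniHttpTTSService(
            api_key="your-api-key",
            aiohttp_session=session,
            settings=GnaniHttpTTSService.Settings(
                voice="sia",
                language="hi-IN",
            ),
        )
        # Build and run your pipeline with `tts` inside this block;
        # the session is closed when the block exits.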

Settings

Parameter	Type	Default	Description
`voice`	`string`	`"sia"`	Voice ID. See Available Voices.
`language`	`string`	`"hi-IN"`	BCP-47 language code. See Supported Languages.
`sample_rate`	`int`	`16000`	Output sample rate in Hz.

Available Voices

Voice	ID
Sia	`sia`
Raju	`raju`
Kanika	`kanika`
Nikita	`nikita`
Ravan	`ravan`
Simran	`simran`
Karan	`karan`
Neha	`neha`

Supported Languages

Language	Code (`Language` enum)	BCP-47 string
Bengali	`Language.BN_IN`	`bn-IN`
English (India)	`Language.EN_IN`	`en-IN`
Gujarati	`Language.GU_IN`	`gu-IN`
Hindi	`Language.HI_IN`	`hi-IN`
Kannada	`Language.KN_IN`	`kn-IN`
Malayalam	`Language.ML_IN`	`ml-IN`
Marathi	`Language.MR_IN`	`mr-IN`
Punjabi	`Language.PA_IN`	`pa-IN`
Tamil	`Language.TA_IN`	`ta-IN`
Telugu	`Language.TE_IN`	`te-IN`

Use the Language enum for GnaniSTTService. Use the BCP-47 string for GnaniHttpTTSService. GnaniTTSService uses "IND-IN" regardless of language — voice selection handles language implicitly.
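The rule above is easy to get wrong when one agent configures all three services from a single language choice. The following sketch encodes it in a small helper. The function names are hypothetical, the codes are copied from the table above, and the real `Language` enum is deliberately not imported so the snippet runs on its own.

```python
# BCP-47 codes copied from the Supported Languages table above.
SUPPORTED_BCP47 = (
    "bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN",
    "ml-IN", "mr-IN", "pa-IN", "ta-IN", "te-IN",
)
WEBSOCKET_TTS_LANGUAGE = "IND-IN"  # fixed identifier used by GnaniTTSService


def _check_supported(bcp47: str) -> None:
    if bcp47 not in SUPPORTED_BCP47:
        raise ValueError(f"unsupported language code: {bcp47!r}")


def stt_enum_member_name(bcp47: str) -> str:
    """Map a BCP-47 code to the matching Language member name, e.g. 'HI_IN'.

    With the plugin installed, getattr(Language, name) would give the member.
    """
    _check_supported(bcp47)
    return bcp47.replace("-", "_").upper()


def tts_language(service_name: str, bcp47: str) -> str:
    """Return the `language` setting a given TTS service expects."""
    _check_supported(bcp47)
    if service_name == "GnaniTTSService":
        return WEBSOCKET_TTS_LANGUAGE  # voice selection carries the language
    if service_name == "GnaniHttpTTSService":
        return bcp47
    raise ValueError(f"unknown TTS service: {service_name!r}")
```

For Tamil, for instance, this gives `TA_IN` for the STT enum member name, `"ta-IN"` for the REST service, and `"IND-IN"` for the WebSocket service.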
