Call Analytics Pipeline

Overview

Every customer call contains signal that most teams never act on. This pipeline surfaces that signal automatically — agent effectiveness, customer sentiment, resolution quality, and upsell opportunities — across any volume of calls, in 10 Indian languages.

Industry	What the pipeline enables
BFSI / Collections	Monitor agent compliance, detect customer distress early, flag missed EMI restructuring opportunities.
Insurance	Analyze claim support calls, track resolution rates, identify policy renewal signals.
Contact Centers / BPOs	Automate QA at scale, reduce manual call review, improve agent training with structured feedback.
Healthcare	Analyze patient support calls, surface unresolved queries, track sentiment across touchpoints.
Telecom	Detect churn signals, identify upsell triggers, monitor service complaint patterns.

Native sentiment — no extra inference cost. The Vachana Batch STT API returns sentiment and emotion per segment directly from the transcription layer. You get speaker-wise sentiment timelines without relying solely on the LLM analysis step.

Prerequisites & Installation

# Gnani Vachana SDK
pip install gnani-vachana

# HTTP client used by the pipeline code for the Batch API calls
pip install requests

# Audio chunking (required for files over 1 hour)
pip install pydub

# Install whichever LLM provider you intend to use
pip install anthropic       # Claude
pip install openai          # OpenAI / ChatGPT

ffmpeg is required for non-WAV formats. pydub needs ffmpeg to process MP3, AAC, M4A, and OGG files. Install it with brew install ffmpeg on macOS or apt install ffmpeg on Debian/Ubuntu.

Authentication

Every request to the Batch STT API requires the X-API-Key-ID header. Store all credentials in environment variables.

Header	Required	Description
`X-API-Key-ID`	Yes	Your Vachana API key. Required on every request — both submit and status calls.
`X-API-Request-ID`	No	A UUID trace ID you assign. Used to correlate your logs with platform logs or support tickets.
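As a minimal sketch of the two headers in the table, the helper below builds a header dictionary and generates a UUID trace ID when none is supplied. The function name `build_headers` is hypothetical and not part of the Vachana SDK.

```python
import uuid
from typing import Dict, Optional


def build_headers(api_key: str, request_id: Optional[str] = None) -> Dict[str, str]:
    """Build request headers: the required API key plus a UUID trace ID."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        "X-API-Key-ID": api_key,
        # Generate a fresh UUID when the caller does not supply a trace ID.
        "X-API-Request-ID": request_id or str(uuid.uuid4()),
    }
```

Logging the generated `X-API-Request-ID` next to the `job_id` makes it easier to match a failed request to a support ticket later.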

.env

# Vachana
GNANI_API_KEY=your-api-key

# LLM provider — set whichever you will use
ANTHROPIC_API_KEY=your-anthropic-key
OPENAI_API_KEY=your-openai-key

# Switch between "claude" and "openai"
LLM_PROVIDER=claude

Never hardcode API keys. Do not commit API keys to version control. Use environment variables, a secrets manager, or a vault. Rotate keys immediately if exposed.
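The pipeline code reads these variables with `os.getenv`, and nothing in the code shown loads a `.env` file automatically, so the values are assumed to be exported into the process environment already. A small sketch of a fail-fast check at startup follows; `require_env` and `load_credentials` are hypothetical helper names.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or raise if it is missing or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def load_credentials() -> dict:
    """Collect the Vachana key plus the key for the selected LLM provider."""
    provider = os.environ.get("LLM_PROVIDER", "claude")
    llm_key_names = {"claude": "ANTHROPIC_API_KEY", "openai": "OPENAI_API_KEY"}
    if provider not in llm_key_names:
        raise ValueError(f"Unknown LLM_PROVIDER '{provider}'. Set to 'claude' or 'openai'.")
    return {
        "gnani_api_key": require_env("GNANI_API_KEY"),
        "llm_provider": provider,
        # Only the key for the provider actually in use is required.
        "llm_api_key": require_env(llm_key_names[provider]),
    }
```

Calling `load_credentials()` before submitting any audio surfaces a missing key immediately rather than after a batch job has already finished.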

Supported Languages

Language	Code	Script	ITN Support
Hindi	`hi-IN`	Devanagari	Yes
English	`en-IN`	Latin	Yes
Tamil	`ta-IN`	Tamil	—
Telugu	`te-IN`	Telugu	—
Kannada	`kn-IN`	Kannada	—
Malayalam	`ml-IN`	Malayalam	—
Bengali	`bn-IN`	Bengali	—
Gujarati	`gu-IN`	Gujarati	—
Marathi	`mr-IN`	Devanagari	—
Punjabi	`pa-IN`	Gurmukhi	—
Hinglish	`en-hi-in-cm`	Latin + Devanagari	Experimental

ITN converts spoken-form numbers, currency, dates, and phone numbers into written form — for example, “five thousand rupees” becomes “₹5,000”. Set format=transcribe in the request to enable it.
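One way to keep the `format` value consistent with the ITN column is to derive it from the language code. The sketch below assumes that only the languages marked "Yes" should request `transcribe`, and it treats Hinglish (marked Experimental) as unsupported. `resolve_format` is a hypothetical helper name; the `verbatim` value is taken from the pipeline code in this guide.

```python
# Languages marked "Yes" for ITN in the table above.
ITN_LANGUAGES = {"hi-IN", "en-IN"}


def resolve_format(language_code: str, itn: bool = True) -> str:
    """Return the `format` field: 'transcribe' enables ITN, 'verbatim' does not."""
    if itn and language_code in ITN_LANGUAGES:
        return "transcribe"
    return "verbatim"
```

For example, `resolve_format("ta-IN")` yields `"verbatim"`, which makes it explicit in your own logs that no written-form conversion was requested for Tamil.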

Batch API Flow

Operation	Method	Endpoint
Submit job	`POST`	`https://api.vachana.ai/stt/v3/batch/submit`
Check status	`GET`	`https://api.vachana.ai/stt/v3/batch/status/{job_id}`

Submit — POST /stt/v3/batch/submit

Upload 1–10 audio files as multipart form data with your language code and format preference. Receive a job_id immediately. Transcription has not started at this point.

Save the job_id

Persist the job_id from the submit response. It is required for every subsequent status call.

Poll — GET /stt/v3/batch/status/{job_id}

Call the status endpoint every 60 seconds. Status transitions: submitted → processing → completed or failed. The results field is null until the job reaches completed.

Parse segments

Extract per-segment fields: speaker_id, text, start_time, end_time, sentiment, emotion, confidence. Build speaker-separated conversation threads and talk-time logs.

Analyze with an LLM

Send the parsed transcript to Claude or OpenAI with a structured analysis prompt. Save the output — analysis, Q&A answers, and batch summary — to the outputs directory.

Minimum poll interval: 60 seconds. The API enforces a 60-second minimum between status calls for the same job_id. Do not reduce this value.
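The status transitions above can be sketched as a small polling loop with an overall deadline, so a stuck job does not block forever. This is an illustration under stated assumptions: `poll_job` is a hypothetical helper, the 4-hour default for `max_wait_s` is arbitrary, and `fetch_status` stands in for whatever function performs the GET request and returns the parsed JSON payload.

```python
import time
from typing import Callable, Optional


def poll_job(
    fetch_status: Callable[[], dict],
    interval_s: float = 60.0,
    max_wait_s: float = 4 * 60 * 60,  # assumed ceiling, not an API value
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[list]:
    """Poll until the job is terminal. Return results, or None if the job failed."""
    if interval_s < 60:
        raise ValueError("The API enforces a 60-second minimum poll interval")
    waited = 0.0
    while waited < max_wait_s:
        sleep(interval_s)
        waited += interval_s
        payload = fetch_status()
        status = payload["status"]
        if status == "completed":
            return payload.get("results") or []
        if status == "failed":
            return None
        # "submitted" and "processing" are non-terminal: keep waiting.
    raise TimeoutError(f"Job still not finished after {max_wait_s:.0f} s")
```

Passing `sleep` as a parameter keeps the loop testable without real 60-second waits.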

Pipeline Implementation

Imports & Setup

imports and config

import hashlib
import json
import os
import time
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional

import requests
from pydub import AudioSegment

try:
    import anthropic
except ImportError:
    anthropic = None

try:
    from openai import OpenAI
except ImportError:
    OpenAI = None

OUTPUT_DIR    = "outputs"
BATCH_SUBMIT  = "https://api.vachana.ai/stt/v3/batch/submit"
BATCH_STATUS  = "https://api.vachana.ai/stt/v3/batch/status/{job_id}"
POLL_INTERVAL = 60   # seconds — minimum enforced by the API

LLM_PROVIDER = os.getenv("LLM_PROVIDER", "claude")

Path(OUTPUT_DIR).mkdir(exist_ok=True)


def split_audio(audio_path: str, chunk_ms: int = 3_600_000) -> List[AudioSegment]:
    """Split audio into chunks of at most 1 hour for Batch API compliance."""
    audio = AudioSegment.from_file(audio_path)
    if len(audio) <= chunk_ms:
        return [audio]
    return [audio[i:i + chunk_ms] for i in range(0, len(audio), chunk_ms)]
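When a recording is split into chunks and each chunk is transcribed separately, the segment timestamps returned for each chunk presumably restart at zero. The sketch below shifts them back onto a single timeline before the conversation is concatenated. It assumes `start_time` and `end_time` are in seconds and relative to the start of each chunk. `merge_chunk_segments` is a hypothetical helper, and it does not reconcile speaker IDs, which may differ from chunk to chunk.

```python
from typing import Dict, List


def merge_chunk_segments(
    chunk_segments: List[List[Dict]],
    chunk_ms: int = 3_600_000,
) -> List[Dict]:
    """Merge per-chunk segment lists into one timeline with absolute timestamps."""
    merged: List[Dict] = []
    for index, segments in enumerate(chunk_segments):
        # Start of this chunk within the original recording, in seconds.
        offset_s = index * chunk_ms / 1000.0
        for seg in segments:
            shifted = dict(seg)  # copy so the caller's data is not mutated
            shifted["start_time"] = seg.get("start_time", 0.0) + offset_s
            shifted["end_time"] = seg.get("end_time", 0.0) + offset_s
            merged.append(shifted)
    return merged
```

The `chunk_ms` value must match the one passed to `split_audio`, and the chunk lists must be supplied in their original order.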

Submit & Poll

process_audio_files + _poll_until_complete

def process_audio_files(
    self,
    audio_paths: List[str],
    language_code: str = "hi-IN",
    itn: bool = True,
) -> Dict[str, dict]:
    """Submit audio files to Vachana Batch STT and poll until complete."""
    if not audio_paths:
        return {}

    files = [
        ("audio_files", (Path(p).name, open(p, "rb"), "audio/wav"))
        for p in audio_paths
    ]
    data = {
        "language_code":   language_code,
        "is_multi_channel": "false",
        "format":          "transcribe" if itn else "verbatim",
    }

    try:
        resp = requests.post(BATCH_SUBMIT, headers=self.headers, files=files, data=data)
        resp.raise_for_status()
    finally:
        # Close the upload handles even if the request fails.
        for _, (_, fh, _) in files:
            fh.close()

    job_id = resp.json()["job_id"]
    print(f"Job submitted: {job_id}")

    results = self._poll_until_complete(job_id)
    if not results:
        return {}

    output_dir = Path(OUTPUT_DIR) / f"job_{job_id}"
    output_dir.mkdir(parents=True, exist_ok=True)

    transcriptions = self._parse_results(results, output_dir)
    self.transcriptions.update(transcriptions)
    print(f"Transcribed {len(transcriptions)} file(s).")

    for fname, d in transcriptions.items():
        self.analyze_transcription(d["conversation_path"], output_dir, fname)

    return transcriptions


def _poll_until_complete(self, job_id: str) -> Optional[list]:
    """Poll the status endpoint every 60 s until a terminal state is reached."""
    url = BATCH_STATUS.format(job_id=job_id)
    print("Polling for results (every 60 s)...")
    while True:
        time.sleep(POLL_INTERVAL)
        r = requests.get(url, headers=self.headers)
        r.raise_for_status()
        payload = r.json()
        status  = payload["status"]
        print(f"  status={status}  progress={payload.get('overall_progress', '–')}%")
        if status == "completed":
            return payload.get("results", [])
        if status == "failed":
            print(f"Job failed: {payload.get('error')}")
            return None

Parsing — Speaker Transcripts & Sentiment Timeline

_parse_results processes the completed API response and writes three output files per call: a speaker-labelled transcript, a per-speaker talk-time log, and a segment-level sentiment timeline.

_parse_results

def _parse_results(self, results: list, output_dir: Path) -> Dict[str, dict]:
    """Parse per-file segment data into conversation and analytics files."""
    transcriptions = {}

    for file_result in results:
        fname    = Path(file_result["filename"]).stem
        segments = file_result.get("segments", [])

        if not segments:
            print(f"No segments returned for {fname}, skipping.")
            continue

        lines, speaker_times, sentiment_log = [], {}, []

        for seg in segments:
            spk  = seg.get("speaker_id", "UNKNOWN")
            text = seg.get("text", "").strip()
            s    = seg.get("start_time", 0.0)
            e    = seg.get("end_time",   0.0)

            lines.append(f"SPEAKER_{spk}: {text}")
            speaker_times[spk] = speaker_times.get(spk, 0.0) + (e - s)
            sentiment_log.append({
                "speaker":    spk,
                "start_time": s,
                "text":       text,
                "sentiment":  seg.get("sentiment", "Neutral"),
                "emotion":    seg.get("emotion",   "Neutral"),
            })

        conv_path      = output_dir / f"{fname}_conversation.txt"
        timing_path    = output_dir / f"{fname}_timing.json"
        sentiment_path = output_dir / f"{fname}_sentiment.json"

        conv_path.write_text("\n".join(lines), encoding="utf-8")
        timing_path.write_text(json.dumps(speaker_times, indent=2), encoding="utf-8")
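        # ensure_ascii=False keeps Indic-script text readable instead of \uXXXX escapes.
        sentiment_path.write_text(json.dumps(sentiment_log, indent=2, ensure_ascii=False), encoding="utf-8")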

        transcriptions[fname] = {
            "conversation_path": str(conv_path),
            "timing_path":       str(timing_path),
            "sentiment_path":    str(sentiment_path),
        }

    return transcriptions

Files produced per call: {name}_conversation.txt — speaker-labelled transcript · {name}_timing.json — talk time per speaker in seconds · {name}_sentiment.json — segment-level sentiment and emotion timeline
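As an illustration of what the sentiment timeline makes possible without an LLM call, the sketch below counts sentiment labels per speaker. It takes entries shaped like those written to `{name}_sentiment.json` by `_parse_results`. The function name is hypothetical, and labels other than "Neutral" (such as "Negative") are assumed rather than confirmed by the API reference.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def sentiment_by_speaker(sentiment_log: List[dict]) -> Dict[str, Dict[str, int]]:
    """Count how many segments each speaker has per sentiment label."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for entry in sentiment_log:
        speaker = str(entry["speaker"])
        # Fall back to "Neutral", mirroring the default used in _parse_results.
        counts[speaker][entry.get("sentiment", "Neutral")] += 1
    return {speaker: dict(c) for speaker, c in counts.items()}
```

Load the file with `json.loads(Path(sentiment_path).read_text(encoding="utf-8"))` and pass the resulting list to the function.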

LLM Analysis

The analysis step sends the parsed conversation to your chosen LLM with a structured prompt. Switch providers by changing the LLM_PROVIDER environment variable.

analysis prompt

ANALYSIS_PROMPT = """
Analyze this call transcription from start to finish.

TRANSCRIPTION:
{transcription}

Provide a structured response covering each of the following:

1. Speaker identification — which speaker is the customer, which is the agent?
2. Customer type — new/potential customer or existing customer?
3. Opening problem — what issue or query did the customer raise initially?
4. Products or services — what was the customer inquiring about or facing issues with?
5. Agent response — how did the agent handle and resolve the issue throughout the call?
6. Resolution outcome — was the issue resolved? Was the customer satisfied at the end?
7. Sentiment arc — how did the customer's sentiment shift across the call?
8. Upsell or cross-sell signals — any opportunities the agent identified or missed?
9. Competitor mentions — were any competitors referenced?
10. Summary — two-sentence outcome summary.
"""

analyze_transcription + _call_llm

def analyze_transcription(self, conversation_path: str, output_dir: Path, file_name: str) -> dict:
    """Run LLM analysis on a parsed conversation file."""
    transcript = Path(conversation_path).read_text(encoding="utf-8")
    analysis   = self._call_llm(
        system="You are a call analytics expert. Provide structured, actionable insights.",
        user=ANALYSIS_PROMPT.format(transcription=transcript),
    )
    out = output_dir / f"{file_name}_analysis.txt"
    out.write_text(analysis.strip(), encoding="utf-8")
    print(f"Analysis saved: {out}")
    return {"file_name": file_name, "analysis_path": str(out)}


def _call_llm(self, system: str, user: str) -> str:
    """Route to Claude or OpenAI based on LLM_PROVIDER env variable."""
    if LLM_PROVIDER == "claude":
        if anthropic is None:
            raise ImportError("Install the anthropic package: pip install anthropic")
        client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
        msg    = client.messages.create(
            model="claude-opus-4-8",  # confirm this ID against the provider's current model list
            max_tokens=2000,
            system=system,
            messages=[{"role": "user", "content": user}],
        )
        return msg.content[0].text

    elif LLM_PROVIDER == "openai":
        if OpenAI is None:
            raise ImportError("Install the openai package: pip install openai")
        client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        resp   = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system},
                {"role": "user",   "content": user},
            ],
        )
        return resp.choices[0].message.content

    raise ValueError(f"Unknown LLM_PROVIDER '{LLM_PROVIDER}'. Set to 'claude' or 'openai'.")
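Long calls can exceed a provider's context or output limits, so the transcript may need to be split before it reaches `_call_llm`. The sketch below splits on line boundaries so that no speaker turn is cut in half. It is a rough approximation: `chunk_transcript` is a hypothetical helper, the budget is counted in characters rather than tokens, and the 12,000-character default is an arbitrary assumption.

```python
from typing import List


def chunk_transcript(transcript: str, max_chars: int = 12_000) -> List[str]:
    """Split a transcript into chunks of whole lines.

    Each chunk is at most max_chars long, unless a single line is longer
    than max_chars, in which case that line forms its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for line in transcript.splitlines():
        added = len(line) + (1 if current else 0)  # +1 for the joining newline
        if current and size + added > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
            added = len(line)
        current.append(line)
        size += added
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Each chunk can then be analyzed separately and the partial analyses combined in a final LLM call. How best to merge them depends on the prompt and is left open here.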

Ad-hoc Q&A

Ask any question against every transcribed call in the current session. This is useful for targeted investigation after bulk processing.

answer_question

def answer_question(self, question: str) -> None:
    """Answer a question for every transcribed call in the current session."""
    for fname, data in self.transcriptions.items():
        transcript = Path(data["conversation_path"]).read_text(encoding="utf-8")
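        answer     = self._call_llm(
            # A non-empty system prompt avoids provider-specific handling of empty strings.
            system="You are a call analytics expert. Answer using only the transcript provided.",
            user=f"TRANSCRIPT:\n{transcript}\n\nQUESTION: {question}",
        )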
        q_hash = hashlib.sha1(question.encode()).hexdigest()[:6]
        out    = Path(data["conversation_path"]).parent / f"{fname}_q_{q_hash}.txt"
        out.write_text(f"Q: {question}\n\nA:\n{answer}", encoding="utf-8")
        print(f"Answer saved: {out}")

Summary Report

Generate a single summary report across all analyzed calls in the session.

get_summary

SUMMARY_PROMPT = """
Based on this call analysis, provide a concise 2–3 word answer for each point:

{analysis_text}

1. Customer and Agent
2. Customer Type
3. Main Issue
4. Service Discussed
5. Agent Response Quality
6. Customer Satisfaction
7. Overall Sentiment
8. Competitor or Upsell Signal
9. Resolution Status
"""


def get_summary(self) -> None:
    """Generate a single summary report across all calls in the session."""
    ts  = datetime.now().strftime("%Y%m%d_%H%M%S")
    out = Path(OUTPUT_DIR) / f"summary_{ts}.txt"

    with open(out, "w", encoding="utf-8") as f:
        f.write(f"CALL ANALYTICS SUMMARY\n{'='*60}\n")
        f.write(f"Generated : {datetime.now()}\n")
        f.write(f"Total calls: {len(self.transcriptions)}\n{'='*60}\n\n")

        for fname, data in self.transcriptions.items():
            af = Path(data["conversation_path"]).parent / f"{fname}_analysis.txt"
            if not af.exists():
                print(f"No analysis file found for {fname}, skipping.")
                continue
            summary = self._call_llm(
                system="You are a call analytics expert. Be concise.",
                user=SUMMARY_PROMPT.format(analysis_text=af.read_text(encoding="utf-8")),
            )
            f.write(f"Call: {fname}\n{'-'*30}\n{summary.strip()}\n\n")

    print(f"Summary saved: {out}")

Full Pipeline

call_analytics_pipeline.py

import hashlib
import json
import os
import time
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional

import requests
from pydub import AudioSegment

try:
    import anthropic
except ImportError:
    anthropic = None

try:
    from openai import OpenAI
except ImportError:
    OpenAI = None

OUTPUT_DIR    = "outputs"
BATCH_SUBMIT  = "https://api.vachana.ai/stt/v3/batch/submit"
BATCH_STATUS  = "https://api.vachana.ai/stt/v3/batch/status/{job_id}"
POLL_INTERVAL = 60
LLM_PROVIDER  = os.getenv("LLM_PROVIDER", "claude")

Path(OUTPUT_DIR).mkdir(exist_ok=True)

ANALYSIS_PROMPT = """
Analyze this call transcription from start to finish.

TRANSCRIPTION:
{transcription}

Provide a structured response covering each of the following:

1. Speaker identification — which speaker is the customer, which is the agent?
2. Customer type — new/potential customer or existing customer?
3. Opening problem — what issue or query did the customer raise initially?
4. Products or services — what was the customer inquiring about or facing issues with?
5. Agent response — how did the agent handle and resolve the issue throughout the call?
6. Resolution outcome — was the issue resolved? Was the customer satisfied at the end?
7. Sentiment arc — how did the customer's sentiment shift across the call?
8. Upsell or cross-sell signals — any opportunities the agent identified or missed?
9. Competitor mentions — were any competitors referenced?
10. Summary — two-sentence outcome summary.
"""

SUMMARY_PROMPT = """
Based on this call analysis, provide a concise 2–3 word answer for each point:

{analysis_text}

1. Customer and Agent
2. Customer Type
3. Main Issue
4. Service Discussed
5. Agent Response Quality
6. Customer Satisfaction
7. Overall Sentiment
8. Competitor or Upsell Signal
9. Resolution Status
"""


def split_audio(audio_path: str, chunk_ms: int = 3_600_000) -> List[AudioSegment]:
    audio = AudioSegment.from_file(audio_path)
    if len(audio) <= chunk_ms:
        return [audio]
    return [audio[i:i + chunk_ms] for i in range(0, len(audio), chunk_ms)]


class CallAnalytics:

    def __init__(self, api_key: str):
        self.headers        = {"X-API-Key-ID": api_key}
        self.transcriptions: Dict[str, dict] = {}

    def process_audio_files(self, audio_paths: List[str], language_code: str = "hi-IN", itn: bool = True) -> Dict[str, dict]:
        if not audio_paths:
            return {}
        files = [("audio_files", (Path(p).name, open(p, "rb"), "audio/wav")) for p in audio_paths]
        data  = {"language_code": language_code, "is_multi_channel": "false",
                  "format": "transcribe" if itn else "verbatim"}
        try:
            resp = requests.post(BATCH_SUBMIT, headers=self.headers, files=files, data=data)
            resp.raise_for_status()
        finally:
            # Close the upload handles even if the request fails.
            for _, (_, fh, _) in files:
                fh.close()
        job_id = resp.json()["job_id"]
        print(f"Job submitted: {job_id}")

        results = self._poll_until_complete(job_id)
        if not results: return {}

        output_dir = Path(OUTPUT_DIR) / f"job_{job_id}"
        output_dir.mkdir(parents=True, exist_ok=True)
        transcriptions = self._parse_results(results, output_dir)
        self.transcriptions.update(transcriptions)
        print(f"Transcribed {len(transcriptions)} file(s).")
        for fname, d in transcriptions.items():
            self.analyze_transcription(d["conversation_path"], output_dir, fname)
        return transcriptions

    def _poll_until_complete(self, job_id: str) -> Optional[list]:
        url = BATCH_STATUS.format(job_id=job_id)
        print("Polling every 60 s...")
        while True:
            time.sleep(POLL_INTERVAL)
            r = requests.get(url, headers=self.headers)
            r.raise_for_status()
            p = r.json()
            print(f"  [{p['status']}] {p.get('overall_progress', '–')}%")
            if p["status"] == "completed": return p.get("results", [])
            if p["status"] == "failed":
                print(f"Job failed: {p.get('error')}")
                return None

    def _parse_results(self, results: list, output_dir: Path) -> Dict[str, dict]:
        transcriptions = {}
        for fr in results:
            fname    = Path(fr["filename"]).stem
            segments = fr.get("segments", [])
            if not segments: continue
            lines, speaker_times, sentiment_log = [], {}, []
            for seg in segments:
                spk = seg.get("speaker_id", "UNKNOWN")
                txt = seg.get("text", "").strip()
                s, e = seg.get("start_time", 0.0), seg.get("end_time", 0.0)
                lines.append(f"SPEAKER_{spk}: {txt}")
                speaker_times[spk] = speaker_times.get(spk, 0.0) + (e - s)
                sentiment_log.append({"speaker": spk, "start_time": s, "text": txt,
                    "sentiment": seg.get("sentiment", "Neutral"),
                    "emotion":   seg.get("emotion",   "Neutral")})
            cp = output_dir / f"{fname}_conversation.txt"
            tp = output_dir / f"{fname}_timing.json"
            sp = output_dir / f"{fname}_sentiment.json"
            cp.write_text("\n".join(lines), encoding="utf-8")
            tp.write_text(json.dumps(speaker_times, indent=2), encoding="utf-8")
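            # ensure_ascii=False keeps Indic-script text readable instead of \uXXXX escapes.
            sp.write_text(json.dumps(sentiment_log, indent=2, ensure_ascii=False), encoding="utf-8")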
            transcriptions[fname] = {"conversation_path": str(cp), "timing_path": str(tp), "sentiment_path": str(sp)}
        return transcriptions

    def _call_llm(self, system: str, user: str) -> str:
        if LLM_PROVIDER == "claude":
            if anthropic is None:
                raise ImportError("Install the anthropic package: pip install anthropic")
            client = anthropic.Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))
            msg    = client.messages.create(model="claude-opus-4-8", max_tokens=2000,
                         system=system, messages=[{"role": "user", "content": user}])
            return msg.content[0].text
        elif LLM_PROVIDER == "openai":
            if OpenAI is None:
                raise ImportError("Install the openai package: pip install openai")
            client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
            resp   = client.chat.completions.create(model="gpt-4o",
                         messages=[{"role": "system", "content": system},
                                   {"role": "user",   "content": user}])
            return resp.choices[0].message.content
        raise ValueError(f"Unknown LLM_PROVIDER '{LLM_PROVIDER}'. Set to 'claude' or 'openai'.")

    def analyze_transcription(self, conversation_path: str, output_dir: Path, fname: str):
        transcript = Path(conversation_path).read_text(encoding="utf-8")
        analysis   = self._call_llm(system="You are a call analytics expert. Provide structured, actionable insights.",
                                     user=ANALYSIS_PROMPT.format(transcription=transcript))
        out = output_dir / f"{fname}_analysis.txt"
        out.write_text(analysis.strip(), encoding="utf-8")
        print(f"Analysis: {out}")

    def answer_question(self, question: str):
        for fname, data in self.transcriptions.items():
            transcript = Path(data["conversation_path"]).read_text(encoding="utf-8")
            answer     = self._call_llm(
                system="You are a call analytics expert. Answer using only the transcript provided.",
                user=f"TRANSCRIPT:\n{transcript}\n\nQUESTION: {question}")
            q_hash = hashlib.sha1(question.encode()).hexdigest()[:6]
            out    = Path(data["conversation_path"]).parent / f"{fname}_q_{q_hash}.txt"
            out.write_text(f"Q: {question}\n\nA:\n{answer}", encoding="utf-8")
            print(f"Q&A: {out}")

    def get_summary(self):
        ts  = datetime.now().strftime("%Y%m%d_%H%M%S")
        out = Path(OUTPUT_DIR) / f"summary_{ts}.txt"
        with open(out, "w", encoding="utf-8") as f:
            f.write(f"CALL ANALYTICS SUMMARY\n{'='*60}\nGenerated: {datetime.now()}\n{'='*60}\n\n")
            for fname, data in self.transcriptions.items():
                af = Path(data["conversation_path"]).parent / f"{fname}_analysis.txt"
                if not af.exists(): continue
                summary = self._call_llm(system="Be concise.",
                    user=SUMMARY_PROMPT.format(analysis_text=af.read_text(encoding="utf-8")))
                f.write(f"Call: {fname}\n{'-'*30}\n{summary.strip()}\n\n")
        print(f"Summary: {out}")


if __name__ == "__main__":
    # os.environ[...] raises KeyError immediately if the key is not set.
    analytics = CallAnalytics(api_key=os.environ["GNANI_API_KEY"])

    analytics.process_audio_files(
        audio_paths=["./call_001.wav"],
        language_code="hi-IN",
        itn=True,
    )

    analytics.answer_question("Did the agent offer any EMI or payment extension options?")
    analytics.get_summary()

Sample Output

outputs/
├── job_batch_7f3a92c1d4e8/
│   ├── call_001_conversation.txt    ← speaker-labelled transcript
│   ├── call_001_timing.json         ← talk time per speaker (seconds)
│   ├── call_001_sentiment.json      ← segment-level sentiment timeline
│   ├── call_001_analysis.txt        ← LLM structured analysis
│   └── call_001_q_a3f9b2.txt        ← ad-hoc Q&A answer
└── summary_20251226_143052.txt      ← batch summary across all calls

call_001_conversation.txt

SPEAKER_1: नमस्ते, मेरा नाम रोहन है। मेरी EMI अगले हफ्ते due है।
SPEAKER_2: नमस्ते रोहन जी, आपका loan account number बताइए।
SPEAKER_1: हाँ, ₹45,000 की EMI है। क्या मुझे extension मिल सकता है?
SPEAKER_2: आपकी request process करते हैं। 3 दिन का extension approve हो सकता है।
SPEAKER_1: ठीक है, शुक्रिया।

call_001_analysis.txt (excerpt)

1. Speaker Identification
   SPEAKER_1 — Customer (Rohan)
   SPEAKER_2 — Agent

2. Customer Type
   Existing customer with an active loan account and an upcoming EMI.

3. Opening Problem
   Customer called to request an EMI payment extension due to cash flow constraints.

6. Resolution Outcome
   Resolved within the call. Customer expressed satisfaction before closing.

7. Sentiment Arc
   Started neutral-to-anxious. Shifted to relieved after the extension was confirmed.

8. Upsell / Cross-sell Signals
   No signals identified or pursued. Loan restructuring or a credit health check
   could have been offered — it was not.

10. Summary
    The customer's EMI extension request was resolved within a single call.
    Agent resolution quality was high; a potential upsell moment was missed.

Limits & Notes

Constraint	Value	What to do
Max file duration	1 hour per file	Use `split_audio()` for longer recordings. It returns in-memory chunks, so export each chunk to its own file before submitting, then concatenate the conversation text after parsing.
Max files per request	10 files	For bulk pipelines, group calls into batches of 10 and submit sequentially.
Max payload size	80 MB total	Use FLAC or Opus to reduce file size before upload.
Poll interval	60 seconds minimum	Do not reduce below 60 s. The API rate-limits per `job_id`.
Speaker diarization	Max 2 speakers	Designed for two-party calls (agent + customer). Multi-party calls are not supported.
ITN support	hi-IN, en-IN only	Other languages return verbatim output regardless of the `format` parameter.
LLM token limits	Varies by provider	For calls over 30 minutes, chunk the transcript before sending to the LLM analysis step.
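The file-count and payload limits above can be applied together when grouping calls for submission. The sketch below is one possible approach, with several assumptions: `batch_paths` is a hypothetical helper, "80 MB" is interpreted conservatively as 80,000,000 bytes, and multipart encoding overhead is ignored, so leave some headroom in practice.

```python
import os
from typing import Callable, Iterable, List

MAX_FILES = 10            # files per submit request
MAX_BYTES = 80_000_000    # assumed reading of "80 MB total"


def batch_paths(
    paths: Iterable[str],
    size_of: Callable[[str], int] = os.path.getsize,
    max_files: int = MAX_FILES,
    max_bytes: int = MAX_BYTES,
) -> List[List[str]]:
    """Group paths into batches that respect both the file-count and size limits."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for path in paths:
        size = size_of(path)
        if size > max_bytes:
            raise ValueError(f"{path} alone exceeds the payload limit")
        # Start a new batch when adding this file would break either limit.
        if current and (len(current) >= max_files or current_bytes + size > max_bytes):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(path)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

Each resulting batch can be passed to `process_audio_files` in turn, waiting for one job to finish before submitting the next.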

Related docs: Batch STT API reference · REST STT for short clips · SDK: pip install gnani-vachana
