Overview
Advanced ASR (Automatic Speech Recognition) settings control how your AI voice agent listens and responds to spoken input. They define listening limits, silence detection, interruption behavior, and background noise filtering for an optimal call experience.
Location in Platform
Manage Agent → Customization → Advanced ASR Settings
Note: These settings apply only to voice channels. They do not affect chat or text-based agents.
Available Settings
Setting | Description |
---|---|
Max Speech Duration | Maximum duration (in seconds) the agent will listen to a single user input |
Initial Silence Timeout | Time to wait for the user to start speaking before cancelling |
End Silence Timeout | Time to wait after user stops speaking before finalizing the input |
Speech Segmentation Silence Timeout | Mid-speech silence duration used to split long speech into segments |
Allow Interruptions | Lets the user speak over the agent; the agent stops speaking and listens immediately |
Interrupt Initial Message | Allows interruptions during the agent’s very first message in a call or interaction |
Background Noise Filtering | Adjusts sensitivity to background sounds to reduce false interruptions |
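As a rough illustration of how the settings in the table relate to one another, the sketch below models them as a plain Python record with a consistency check. The class name, field names, and units other than the seconds stated for Max Speech Duration are assumptions made for this example; they are not the platform's API, and the example values are not platform defaults.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AdvancedAsrSettings:
    """Hypothetical mirror of the settings table; not a platform API."""

    max_speech_duration_s: float                  # seconds, as stated in the table
    initial_silence_timeout_s: float              # unit assumed to be seconds
    end_silence_timeout_s: float                  # unit assumed to be seconds
    speech_segmentation_silence_timeout_s: float  # unit assumed to be seconds
    allow_interruptions: bool
    interrupt_initial_message: bool
    background_noise_filtering: int               # assumed 20-100 scale

    def problems(self) -> List[str]:
        """Return a list of consistency problems; an empty list means none found."""
        found: List[str] = []
        durations = {
            "max_speech_duration_s": self.max_speech_duration_s,
            "initial_silence_timeout_s": self.initial_silence_timeout_s,
            "end_silence_timeout_s": self.end_silence_timeout_s,
            "speech_segmentation_silence_timeout_s":
                self.speech_segmentation_silence_timeout_s,
        }
        for name, value in durations.items():
            if value <= 0:
                found.append(f"{name} must be greater than zero")
        # Interrupt Initial Message only applies when Allow Interruptions is ON.
        if self.interrupt_initial_message and not self.allow_interruptions:
            found.append("interrupt_initial_message needs allow_interruptions to be on")
        if not 20 <= self.background_noise_filtering <= 100:
            found.append("background_noise_filtering is expected to be between 20 and 100")
        return found


# Illustrative values only; they are not platform defaults.
example = AdvancedAsrSettings(30.0, 5.0, 1.0, 0.5, True, True, 60)
print(example.problems())
```

A check like this can be handy when settings are kept in a review checklist or spreadsheet before being entered in the platform.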
How Each Setting Works
1. Max Speech Duration
Description: Specifies how long the agent will listen to user input in one stretch before automatically stopping.
Use Case: Prevents prolonged listening due to background noise or over-talking. Ensures the agent maintains a responsive, controlled interaction.
2. Initial Silence Timeout
Description: Defines how long the agent will wait for the user to begin speaking at the start of a turn. If the user remains silent beyond this threshold, input is cancelled or retried.
Use Case: Useful when users are unsure, distracted, or take time to process the prompt. Prevents the system from hanging indefinitely.
3. End Silence Timeout
Description: Sets the duration of silence after the user stops speaking, after which the agent considers the input complete and proceeds.
Use Case: Allows for natural pauses while still keeping the interaction smooth. Essential for avoiding premature cutoff.
4. Speech Segmentation Silence Timeout
Description: When users speak in long sentences or paragraphs, this setting helps break the input into segments based on silence detection. Particularly useful for streaming ASR or multi-sentence inputs.
Use Case: Improves comprehension and reduces memory load on the model by processing inputs in manageable chunks.
5. Allow Interruptions
Description: When enabled, the agent will stop speaking and immediately listen when the user starts talking.
Use Case: Creates a more natural, back-and-forth conversation where users can cut in.
6. Interrupt Initial Message
Description: When enabled (and Allow Interruptions is ON), the agent will also allow interruptions during its very first message.
Use Case: Lets impatient users respond immediately, even during the greeting.
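The way Allow Interruptions and Interrupt Initial Message combine can be written as a small decision function. This is a minimal sketch of the rule described above; the function and parameter names are hypothetical and do not correspond to any platform API.

```python
def may_interrupt(allow_interruptions: bool,
                  interrupt_initial_message: bool,
                  is_initial_message: bool) -> bool:
    """Decide whether user speech should cut off the agent (illustrative only)."""
    if not allow_interruptions:
        # With Allow Interruptions OFF, the agent always finishes speaking.
        return False
    if is_initial_message:
        # The first message is only interruptible when the second toggle is ON.
        return interrupt_initial_message
    return True


# Print the outcome for every combination of the two toggles.
for allow in (False, True):
    for initial_toggle in (False, True):
        print(
            f"allow={allow!s:5} interrupt_initial={initial_toggle!s:5} "
            f"greeting={may_interrupt(allow, initial_toggle, True)!s:5} "
            f"later={may_interrupt(allow, initial_toggle, False)}"
        )
```

Note that turning on Interrupt Initial Message while Allow Interruptions is OFF changes nothing in this model, which matches the dependency stated in the description.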
7. Background Noise Filtering
Description: Controls how sensitive the interruption feature is to background sounds.
Low values (closer to 20): More likely to trigger on quiet sounds, including unwanted noise.
High values (closer to 100): Better at ignoring noise like traffic or barking, but may miss very soft speech.
Use Case: Reduce false triggers while balancing responsiveness.
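The trade-off can be pictured with a toy model in which a sound only counts as an interruption once its loudness reaches the filtering value. This is a simplification for illustration: the threshold rule, the function name, and the loudness scores below are all made up, and the platform's actual noise handling is not documented here.

```python
def triggers_interruption(loudness: int, filtering_level: int) -> bool:
    """Toy rule (assumption): a sound interrupts if its loudness reaches the level."""
    if not 20 <= filtering_level <= 100:
        raise ValueError("filtering_level is expected to be between 20 and 100")
    return loudness >= filtering_level


# Made-up loudness scores on the same 0-100 scale, for illustration only.
SOUNDS = {"distant traffic": 30, "soft speech": 45, "normal speech": 70}

for level in (20, 60):
    heard = [name for name, loudness in SOUNDS.items()
             if triggers_interruption(loudness, level)]
    print(f"filtering level {level}: interrupted by {heard}")
```

In this toy model a value of 20 lets all three sounds interrupt the agent, while a value of 60 ignores the traffic but also ignores the soft speech, which is the balance the setting asks you to strike.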
Recommendations
- Use shorter timeouts for transactional bots (e.g., booking, verification).
- Use longer timeouts for support scenarios, complex discussions, or with elderly users.
- Use the Speech Segmentation Silence Timeout when expecting detailed or multi-part answers.
- Enable Allow Interruptions for more natural, human-like interactions.
- Adjust Background Noise Filtering based on the expected environment.
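To see why timeout length matters for different scenarios, the sketch below replays the same caller against a short and a long End Silence Timeout. It is a simplified model of the timing settings under stated assumptions: the function name, the timeline format, and all numeric values are hypothetical, units are assumed to be seconds, the timeline is assumed to alternate between speech and silence, and Speech Segmentation Silence Timeout is left out.

```python
from typing import List, Tuple

Timeline = List[Tuple[str, float]]  # ("speech" or "silence", duration in seconds)


def listen(timeline: Timeline,
           initial_silence_timeout: float,
           end_silence_timeout: float,
           max_speech_duration: float) -> Tuple[str, float]:
    """Return (outcome, seconds listened since the user started speaking)."""
    started = False
    listened = 0.0
    for kind, seconds in timeline:
        if kind == "speech":
            started = True
        elif not started:
            # Silence before any speech: give up once the initial timeout passes.
            if seconds >= initial_silence_timeout:
                return ("no_input", 0.0)
            continue
        elif seconds >= end_silence_timeout:
            # A pause this long ends the turn; the input is finalized.
            return ("finalized", listened)
        # Speech and short pauses both count toward the maximum duration.
        if listened + seconds >= max_speech_duration:
            return ("max_duration_reached", max_speech_duration)
        listened += seconds
    return ("finalized", listened)


# A caller who waits 1 s, talks for 3 s, pauses 1.5 s to think, then talks 2 s more.
caller: Timeline = [("silence", 1.0), ("speech", 3.0), ("silence", 1.5),
                    ("speech", 2.0), ("silence", 3.0)]

# Illustrative presets only; these numbers are not platform recommendations.
print("transactional:", listen(caller, 3.0, 1.0, 15.0))
print("support:      ", listen(caller, 6.0, 2.0, 60.0))
```

With the shorter End Silence Timeout the turn is closed at the thinking pause and the second half of the answer is lost, while the longer timeout captures the full answer at the cost of a slower response once the caller is done.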