
Barge-In

The ability of callers to interrupt the assistant mid-sentence. It is considered a marker of natural conversation and is implemented via speech-to-text (STT) running in parallel with playback, combined with voice-activity detection.

Barge-in is the ability of an AI phone assistant to be interrupted by the caller mid-utterance, as in a natural conversation. Without it, calls feel robotic because callers have to wait for the system to finish speaking.

Technically, it relies on continuous speech-to-text with voice-activity detection (VAD): the moment voice input is detected, the active text-to-speech (TTS) playback is cut and the system switches into listen mode. Thresholds and hysteresis matter, because without them background noise hijacks the dialogue.
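The role of thresholds and hysteresis can be sketched with a simple energy-based detector. This is a minimal illustration, not a production VAD: the class names `HysteresisVad` and `Assistant`, the threshold values, and the use of RMS energy as the voice signal are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


def rms(frame):
    """Root-mean-square energy of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


@dataclass
class HysteresisVad:
    # Hypothetical thresholds; real values depend on the audio pipeline.
    on_threshold: float = 0.10   # energy needed to enter the "speech" state
    off_threshold: float = 0.05  # energy below which the "speech" state ends
    speaking: bool = False

    def update(self, frame):
        """Feed one frame; return True while the caller counts as speaking."""
        energy = rms(frame)
        if self.speaking:
            if energy < self.off_threshold:
                self.speaking = False
        elif energy >= self.on_threshold:
            self.speaking = True
        return self.speaking


class Assistant:
    """Toy stand-in for the TTS side: tracks whether playback is active."""

    def __init__(self):
        self.mode = "speaking"

    def on_frame(self, vad, frame):
        # Cut playback and switch to listen mode once the VAD reports speech.
        if vad.update(frame) and self.mode == "speaking":
            self.mode = "listening"
        return self.mode
```

Because the "on" threshold sits above the "off" threshold, moderate noise between the two values cannot start an interruption, while a caller who has already started speaking is not dropped the moment their volume dips.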

In production, barge-in reduces perceived latency and can improve conversion rates. Edge cases worth tuning include loud office noise, hold music on the caller’s side, and very short confirmations ("yes", "okay"), which must be classified correctly to avoid ping-pong cuts.
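One possible way to handle short confirmations is to look at the recognized text before deciding whether playback should really stop. The sketch below is an assumed policy, not a description of any specific product: the word list `CONFIRMATIONS`, the function `classify_barge_in`, and the `assistant_asked_question` flag are hypothetical.

```python
# Hypothetical word list; a real system would tune this per language.
CONFIRMATIONS = {"yes", "yeah", "okay", "ok", "no", "right"}


def normalize(transcript):
    """Lowercase and strip surrounding whitespace and punctuation."""
    return transcript.lower().strip(" \t\n.,!?")


def classify_barge_in(transcript, assistant_asked_question):
    """Decide how to react to caller speech that arrived during playback.

    Returns one of:
      "ignore"      - nothing recognizable was said
      "answer"      - short confirmation while a question is pending
      "backchannel" - short confirmation with no question pending
      "interrupt"   - anything else: stop playback and listen
    """
    text = normalize(transcript)
    if not text:
        return "ignore"
    if text in CONFIRMATIONS:
        return "answer" if assistant_asked_question else "backchannel"
    return "interrupt"
```

Under this policy, a caller saying "okay" while the assistant explains something would not cut the explanation, whereas the same word after a yes/no question would be treated as the reply.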

FAQ
When does barge-in fail?
When VAD is too aggressive, coughs, doors, or keyboards interrupt the assistant. A conservative energy threshold plus a minimum speech duration (~150–250 ms) usually fixes it.
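A minimum-duration check can be sketched as a small gate placed after the per-frame voice decision. The names `MinDurationGate` and `first_trigger`, the 20 ms frame size, and the 200 ms minimum are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MinDurationGate:
    # Assumed values: 20 ms frames and a 200 ms minimum,
    # which falls inside the ~150-250 ms range.
    frame_ms: int = 20
    min_speech_ms: int = 200
    voiced_ms: int = 0

    def update(self, frame_is_voiced):
        """Return True once voiced audio has lasted at least min_speech_ms."""
        if frame_is_voiced:
            self.voiced_ms += self.frame_ms
        else:
            self.voiced_ms = 0  # any unvoiced frame resets the run
        return self.voiced_ms >= self.min_speech_ms


def first_trigger(frames, gate):
    """Index of the frame at which barge-in would fire, or None."""
    for i, voiced in enumerate(frames):
        if gate.update(voiced):
            return i
    return None


# An 80 ms burst (for example, a cough) never reaches the minimum duration.
cough = [True] * 4 + [False] * 10
# 300 ms of continuous speech crosses the 200 ms mark on its tenth frame.
speech = [True] * 15
```

Resetting on every unvoiced frame is the strictest variant; a real system might tolerate brief gaps so that a short pause inside a word does not restart the count.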