Glossary

Turn-Taking

The management of switches between the speaker and listener roles. It goes beyond raw barge-in to cover pause detection, handling of backchannels ("mhm"), and avoidance of double-talk. Key to conversational naturalness.

Turn-taking describes how a voice-AI system decides when to speak and when to listen. Poor turn-taking creates a "walkie-talkie" feel — either the system interrupts constantly or it waits painfully long before replying.

Good turn-taking heuristics combine voice-activity detection (VAD), prosodic end-of-turn signals ("…alright then."), pause length, and the LLM’s semantic endpoint prediction. Typical target windows are 250–500 ms after the caller pauses, extended dynamically when a thinking pause is detected.
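
As a rough illustration of how these signals might be combined, the sketch below maps a semantic completeness score onto the 250–500 ms window and adds extra waiting time for a thinking pause. Only the window itself comes from the text above; the names `EndpointConfig`, `required_pause_ms`, and `should_respond`, the 0..1 completeness score, the linear interpolation, and the 400 ms extension are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointConfig:
    # The 250-500 ms window is from the glossary text.
    min_pause_ms: int = 250
    max_pause_ms: int = 500
    # Assumed value: extra wait when a thinking pause is detected.
    thinking_extension_ms: int = 400


def required_pause_ms(cfg: EndpointConfig, completeness: float,
                      thinking_pause: bool) -> int:
    """Silence the caller must hold before the system takes the turn.

    completeness is a hypothetical 0..1 score from a semantic endpoint
    predictor (1.0 = the utterance sounds finished).
    """
    c = min(max(completeness, 0.0), 1.0)  # clamp to 0..1
    # Linear interpolation: finished-sounding utterances get the short pause.
    base = cfg.max_pause_ms - c * (cfg.max_pause_ms - cfg.min_pause_ms)
    if thinking_pause:
        base += cfg.thinking_extension_ms
    return round(base)


def should_respond(cfg: EndpointConfig, silence_ms: float,
                   completeness: float, thinking_pause: bool = False) -> bool:
    """True once VAD-measured silence reaches the required pause."""
    return silence_ms >= required_pause_ms(cfg, completeness, thinking_pause)
```

With the default configuration, `should_respond(EndpointConfig(), 300, 1.0)` is true, while the same 300 ms of silence after an unfinished-sounding utterance (`completeness=0.0`) is not yet enough.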

In production, context-dependent profiles pay off: outbound sales can react slightly faster, while support for elderly callers should react slightly slower. Measurable KPIs are the share of interrupted caller utterances, mean response latency, and the drop-off rate after latency spikes.
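
A minimal sketch of how such profiles and KPIs could be expressed follows. The profile names, the offsets in milliseconds, the `Turn` record, and the 1500 ms spike threshold are all hypothetical values chosen for illustration; a real deployment would tune them from call data.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical per-context shifts applied to the 250-500 ms pause window.
PROFILE_OFFSET_MS = {"outbound_sales": -50, "default": 0, "support_elderly": 150}


def pause_window_ms(profile: str) -> tuple[int, int]:
    """Return the (min, max) pause window for a profile."""
    offset = PROFILE_OFFSET_MS[profile]
    return 250 + offset, 500 + offset


@dataclass(frozen=True)
class Turn:
    response_latency_ms: float  # caller stopped speaking -> system started
    caller_interrupted: bool    # system spoke over the caller
    dropped_after: bool         # caller hung up right after this turn


def turn_taking_kpis(turns: list[Turn], spike_ms: float = 1500.0) -> dict[str, float]:
    """Compute the three KPIs named above for a non-empty list of turns."""
    if not turns:
        raise ValueError("need at least one turn")
    # Turns whose latency exceeds the (assumed) spike threshold.
    spikes = [t for t in turns if t.response_latency_ms > spike_ms]
    return {
        "interrupted_share": sum(t.caller_interrupted for t in turns) / len(turns),
        "mean_latency_ms": mean(t.response_latency_ms for t in turns),
        "drop_off_after_spike": (
            sum(t.dropped_after for t in spikes) / len(spikes) if spikes else 0.0
        ),
    }
```

Tracking these values per profile makes it possible to check whether a faster or slower window actually reduces interruptions or drop-offs for that caller group.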
