Natural Language Understanding (NLU) is the component that turns unstructured caller utterances into structured intents and entities (slot values). For example, "I need an appointment next Tuesday at 10" maps to intent=book_appointment, slot.day=Tuesday, slot.time=10:00.
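To make the mapping concrete, here is a minimal sketch of the target structure together with a toy keyword-and-regex extractor. The names `NLUResult` and `parse_utterance` are hypothetical, and the rules cover only the example sentence above; a real system would need far broader coverage.

```python
import re
from dataclasses import dataclass, field

@dataclass
class NLUResult:
    """Structured output of the NLU step (hypothetical shape)."""
    intent: str
    slots: dict[str, str] = field(default_factory=dict)

DAYS = ("monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday")

def parse_utterance(text: str) -> NLUResult:
    """Toy extractor: keyword match for the intent, regex for the slots."""
    lowered = text.lower()
    # Assumption: the word "appointment" alone signals a booking request.
    intent = "book_appointment" if "appointment" in lowered else "unknown"

    slots: dict[str, str] = {}
    day = next((d for d in DAYS if re.search(rf"\b{d}\b", lowered)), None)
    if day:
        slots["day"] = day.capitalize()

    # Matches "at 10" or "at 10:30"; a bare hour is normalised to HH:00.
    match = re.search(r"\bat (\d{1,2})(?::(\d{2}))?\b", lowered)
    if match:
        slots["time"] = f"{int(match.group(1)):02d}:{match.group(2) or '00'}"

    return NLUResult(intent=intent, slots=slots)

result = parse_utterance("I need an appointment next Tuesday at 10")
print(result.intent, result.slots)
```

Note that the sketch cannot tell 10 in the morning from 10 in the evening; it simply copies the hour, which is the kind of ambiguity a production NLU has to resolve from context.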
Historically, NLU was implemented with rule-based grammars or specialised ML models, often built with tools such as Rasa or Dialogflow. Today most voice-AI stacks delegate the job to an LLM with structured output (JSON schema or function calling). This is more robust to paraphrase, but it adds latency and token cost.
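When an LLM produces the structured output, the calling code still has to check it before acting on it. The following sketch assumes the model was asked to return a JSON object with an `intent` string and a `slots` object; the function name, the intent list, and the fallback value are illustrative assumptions, and no actual model call is shown.

```python
import json

# Hypothetical set of intents the application knows how to handle.
ALLOWED_INTENTS = {"book_appointment", "cancel_appointment", "unknown"}

def parse_llm_output(raw: str) -> dict:
    """Validate a raw model response; fall back to 'unknown' on any problem."""
    fallback = {"intent": "unknown", "slots": {}}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback  # the model returned something that is not JSON
    if not isinstance(data, dict):
        return fallback

    intent = data.get("intent")
    slots = data.get("slots", {})
    if not isinstance(intent, str) or intent not in ALLOWED_INTENTS:
        return fallback  # missing or unsupported intent label
    if not isinstance(slots, dict):
        return fallback

    # Normalise slot keys and values to strings for downstream code.
    return {"intent": intent,
            "slots": {str(k): str(v) for k, v in slots.items()}}

# `raw` stands in for the text returned by the model.
raw = '{"intent": "book_appointment", "slots": {"day": "Tuesday", "time": "10:00"}}'
print(parse_llm_output(raw))
```

Returning a fallback instead of raising keeps the dialogue going: the caller can ask a clarifying question rather than drop the call.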
The quality of an NLU implementation comes down to three measures: intent accuracy on edge cases (dialect, abbreviations, ambiguous wording), slot-filling rate (all required fields captured on the first try), and recovery behaviour after a misclassification.
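The first two measures can be computed from a labelled test set. The sketch below is one possible way to do so; the function names, the `REQUIRED_SLOTS` table, and the sample data are made up for illustration. Recovery behaviour is not covered here because it depends on the dialogue flow rather than on a single prediction.

```python
# Hypothetical mapping from intent to the slots it needs before it can run.
REQUIRED_SLOTS = {"book_appointment": ("day", "time")}

def intent_accuracy(expected: list[str], predicted: list[str]) -> float:
    """Share of utterances whose predicted intent equals the label."""
    if not expected or len(expected) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(e == p for e, p in zip(expected, predicted))
    return hits / len(expected)

def slot_filling_rate(first_turn_results: list[tuple[str, dict]]) -> float:
    """Share of first-turn results in which every required slot is present.

    Intents with no entry in REQUIRED_SLOTS count as filled.
    """
    if not first_turn_results:
        raise ValueError("need at least one result")
    filled = sum(
        all(slot in slots for slot in REQUIRED_SLOTS.get(intent, ()))
        for intent, slots in first_turn_results
    )
    return filled / len(first_turn_results)

# Made-up evaluation data, not real measurements.
expected = ["book_appointment", "cancel_appointment", "book_appointment", "unknown"]
predicted = ["book_appointment", "book_appointment", "book_appointment", "unknown"]
results = [
    ("book_appointment", {"day": "Tuesday", "time": "10:00"}),
    ("book_appointment", {"day": "Tuesday"}),  # time is missing
]
print(intent_accuracy(expected, predicted))  # 0.75
print(slot_filling_rate(results))            # 0.5
```

For the edge-case focus described above, the test set would be built mainly from dialect, abbreviated, and ambiguous utterances rather than from clean, typical phrasing.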