Hallucination
Short Definition
An AI hallucination is generated output that sounds confident and coherent but is factually wrong, fabricated, or unsupported by the model's training data or the provided context.
Full Definition
Hallucination is one of the most critical challenges in modern AI, particularly for large language models (LLMs) and other generative systems. The term describes the phenomenon where a model produces output that appears confident and coherent but is factually wrong, fabricated, or unsupported by its training data or the provided context. Unlike many human errors, AI hallucinations are dangerous precisely because they are delivered with the same fluency and confidence as accurate information, making them difficult for users to detect without independent verification.

Hallucinations occur for several reasons. Language models are trained to predict plausible next tokens, not to verify factual accuracy; they learn statistical patterns in text rather than a grounded understanding of truth. When asked about topics where training data is sparse, contradictory, or absent, they fill the gap with plausible-sounding but incorrect content. Hallucinations can manifest as fabricated facts, incorrect attributions, made-up citations, nonexistent events, and false statistics. The problem is especially concerning in high-stakes domains such as healthcare, law, and finance, where incorrect information can have serious consequences.

Significant research effort is directed at reducing hallucinations through techniques including retrieval-augmented generation (grounding outputs in verified sources), improved training objectives, RLHF alignment, output verification systems, and confidence calibration. Despite progress, hallucination remains an unsolved problem and is a primary focus of AI safety research.
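As a concrete illustration of the retrieval-augmented generation idea mentioned above, the sketch below retrieves the passages most relevant to a question and builds a prompt that instructs the model to answer only from them. The corpus, the naive word-overlap scoring, and the prompt template are all illustrative assumptions, not any particular library's API.

```python
# Minimal sketch of the retrieval step in retrieval-augmented generation (RAG).
# Corpus, scoring, and prompt template are illustrative assumptions.

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive word overlap with the query; return the top k."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_grounded_prompt(query: str, passages: list[str]) -> str:
    """Instruct the model to answer only from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using ONLY the context below. "
        "If the context is insufficient, say 'I don't know.'\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

corpus = [
    "The Eiffel Tower is 330 metres tall.",
    "Photosynthesis converts light energy into chemical energy.",
    "The Great Wall of China is over 21,000 km long.",
]
query = "How tall is the Eiffel Tower?"
prompt = build_grounded_prompt(query, retrieve(query, corpus))
```

A production retriever would use embedding similarity rather than word overlap, but the grounding principle is the same: the model is constrained to verified source text instead of its parametric memory.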
Technical Explanation
Hallucination in LLMs arises from the next-token-prediction training objective: the model learns P(token_n | token_1, …, token_{n-1}) by maximizing the likelihood of its training data, with no explicit grounding in truth. Two types are commonly distinguished: intrinsic hallucination, where the output contradicts the source or prompt, and extrinsic hallucination, where the output cannot be verified from the source at all.

Mitigation strategies include retrieval-augmented generation (RAG), which grounds responses in retrieved documents; constrained decoding, which restricts output to verified information; self-consistency checking, which samples multiple responses and selects the consensus; and chain-of-thought verification, where the model checks its own reasoning. RLHF training can penalize confidently incorrect outputs. Factual-grounding scores and citation-verification systems provide post-generation checking. Reducing the sampling temperature decreases randomness but does not eliminate hallucination, and calibration training aims to align the model's stated confidence with its actual accuracy.
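The self-consistency check described above can be sketched in a few lines: sample the model several times on the same prompt and keep the answer most samples agree on, treating low agreement as a hallucination warning. The sampled answers here are hard-coded stand-ins for real model calls made at temperature > 0.

```python
from collections import Counter

def consensus_answer(samples: list[str]) -> tuple[str, float]:
    """Return the majority answer and the fraction of samples that agree."""
    answer, count = Counter(samples).most_common(1)[0]
    return answer, count / len(samples)

# Simulated draws for the same prompt, "What is the capital of France?"
# (a real system would collect these from repeated model calls)
samples = ["Paris", "Paris", "Marseille", "Paris", "Paris", "Lyon", "Paris"]
answer, agreement = consensus_answer(samples)
# A low agreement fraction flags the answer for verification or human review.
```

The intuition is that a model tends to hallucinate inconsistently: fabricated answers vary across samples, while well-supported answers recur, so the agreement fraction serves as a cheap confidence signal.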
Use Cases
Advantages
Disadvantages
Schema Type
Featured Snippet Candidate
Difficulty Level