Exploring and mitigating fawning hallucinations in large language models

  • Zixuan Shangguan
  • Yanjie Dong*
  • Lanjun Wang
  • Xiaoyi Fan
  • Victor C.M. Leung
  • Xiping Hu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Large language models (LLMs) have demonstrated exceptional proficiency in language understanding. However, when LLMs align their outputs with deceptive and/or misleading prompts, the generated responses can deviate from factual information. This behavior is known as fawning hallucination: the model prioritizes alignment with the input's implied perspective over accuracy and truthfulness. In this work, we analyze fawning hallucinations across various natural language processing tasks and tailor the contrastive decoding method to mitigate them. Specifically, we design two paradigms that generate deceptive and/or misleading inputs to consistently induce fawning hallucinations. We then propose the collaborative contrastive decoding (CCD) method to handle fawning hallucinations across different tasks in LLMs. By contrasting the deviation in output distributions between induced inputs and their transformed neutral counterparts, the proposed CCD reduces reliance on deceptive and/or misleading information without requiring additional training. Extensive experiments demonstrate that CCD effectively mitigates fawning hallucinations and improves the factuality of the generated responses across various tasks.
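The contrast step described in the abstract — comparing next-token distributions under a deceptive/misleading prompt and a neutralized version of it — can be sketched as below. This is a generic contrastive-decoding sketch under stated assumptions, not the paper's exact CCD formulation: the weighting parameter `alpha`, the log-softmax contrast, and the function name are illustrative choices, since the abstract does not specify them.

```python
import numpy as np

def contrastive_decode(logits_neutral, logits_induced, alpha=1.0):
    """One greedy decoding step of a generic contrastive scheme (a sketch,
    not the paper's exact CCD).

    Tokens whose probability rises only under the deceptive/misleading
    framing are penalized relative to the neutralized prompt, which is the
    intuition behind contrasting the two output distributions.
    """
    logits_neutral = np.asarray(logits_neutral, dtype=float)
    logits_induced = np.asarray(logits_induced, dtype=float)

    def log_softmax(x):
        # Subtract the max for numerical stability before exponentiating.
        x = x - x.max()
        return x - np.log(np.exp(x).sum())

    lp_neutral = log_softmax(logits_neutral)
    lp_induced = log_softmax(logits_induced)
    # Down-weight mass that the induced prompt adds over the neutral one.
    adjusted = lp_neutral - alpha * (lp_induced - lp_neutral)
    return int(np.argmax(adjusted))
```

As a toy usage: if a misleading prompt boosts a sycophantic token (index 1) that the neutral prompt does not favor, greedy decoding on the induced logits alone picks token 1, while the contrastive score recovers the neutral prompt's preferred token 0.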

Original language: English
Article number: 132166
Journal: Neurocomputing
Volume: 665
DOIs
Publication status: Published - 7 Feb 2026
Externally published: Yes

Keywords

  • Contrastive decoding
  • Hallucination
  • Large language models
