# LIME: Mitigating Hallucination in Multimodal Large Language Models via Relevance Propagation

> LIME is an innovative open-source implementation that detects and mitigates hallucination in multimodal large language models through relevance propagation during inference, providing a new technical path to enhance the reliability of AI systems.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Posted: 2026-04-26T11:09:26.000Z
- Last activity: 2026-04-26T11:23:33.496Z
- Popularity: 157.8
- Keywords: LIME, Multimodal LLM, Hallucination Mitigation, Relevance Propagation, Inference-Time Detection, PyTorch Implementation, Model Reliability
- Page URL: https://www.zingnex.cn/en/forum/thread/lime
- Canonical: https://www.zingnex.cn/forum/thread/lime
- Markdown source: floors_fallback

---

## Main Post

LIME is an innovative open-source implementation that detects and mitigates hallucination in multimodal large language models through relevance propagation during inference, providing a new technical path to enhance the reliability of AI systems.

## Research Background and Problem Definition

The rapid development of Multimodal Large Language Models (Multimodal LLMs) has opened new possibilities for AI applications, enabling models to understand and generate content spanning text, images, and video. However, these models suffer from a serious reliability issue: hallucination, where the model generates content that appears plausible but is inconsistent with the input.

Hallucination is particularly prominent in multimodal scenarios because the model must integrate information across modalities, and cross-modal alignment and grounding are error-prone. For example, a model describing an image may invent details that are not present, or misinterpret the visual content entirely. This not only degrades the user experience but also poses serious risks in safety-critical applications such as medical diagnosis and autonomous driving.

## Overview of the LIME Method

LIME (Mitigating Hallucination in Multimodal LLMs via Relevance Propagation) proposes a new method for dynamically detecting and mitigating hallucination during inference. Unlike methods that require extensive modifications during the training phase, LIME is a post-processing technique that can be directly applied to pre-trained models without retraining or fine-tuning.

## Core Idea

The core insight of LIME is that hallucination typically arises when the model attends insufficiently to parts of the input or forms incorrect cross-modal associations. By analyzing the model's internal relevance propagation patterns, we can identify output content that lacks sufficient input support and mark it as a potential hallucination.

## Technical Framework

The LIME method consists of three key components:

1. **Relevance Calculation**: Track the attention relevance between input tokens and output tokens during inference
2. **Hallucination Detection**: Identify generated content that lacks input support based on relevance thresholds
3. **Mitigation Strategy**: Correct or suppress detected hallucinations
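The three components can be sketched as a minimal pipeline. Everything below (function names, the averaging scheme, the relevance threshold, the logit-penalty mitigation) is illustrative, not taken from the LIME codebase; a real implementation would operate on PyTorch attention tensors rather than NumPy arrays:

```python
import numpy as np

def compute_relevance(attn_weights):
    """Component 1: aggregate per-layer attention into a single
    input-to-output relevance matrix by averaging over layers and heads.

    attn_weights: shape (layers, heads, out_tokens, in_tokens)
    returns: (out_tokens, in_tokens) matrix whose rows sum to 1.
    """
    rel = attn_weights.mean(axis=(0, 1))          # average layers and heads
    return rel / rel.sum(axis=-1, keepdims=True)  # normalize per output token

def detect_hallucinations(relevance, input_support_mask, threshold=0.3):
    """Component 2: flag output tokens whose total relevance to the
    supported input positions (e.g. image tokens) falls below a threshold."""
    support = (relevance * input_support_mask).sum(axis=-1)
    return support < threshold  # True = potential hallucination

def mitigate(logits, flags, penalty=5.0):
    """Component 3: one possible mitigation, down-weighting the logits
    of flagged positions before sampling."""
    logits = logits.copy()
    logits[flags] -= penalty
    return logits
```

In a decoding loop, `compute_relevance` would consume the attention weights the model already produces, so detection adds no extra forward passes.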

## Analysis of Attention Mechanism

The self-attention mechanism in the Transformer architecture provides a natural foundation for relevance analysis. In each layer of attention calculation, the model implicitly establishes the association strength between input tokens. LIME leverages this feature to construct a global relevance graph by aggregating attention weights from multiple layers.
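A standard way to aggregate layer-wise attention into a global relevance graph is attention rollout: multiply the per-layer attention matrices together, mixing in the identity to account for residual connections. Whether LIME uses exactly this aggregation is an assumption; the sketch below shows the general technique:

```python
import numpy as np

def attention_rollout(layer_attns):
    """layer_attns: list of (tokens, tokens) head-averaged attention
    matrices, one per layer, with rows summing to 1.

    Returns a global token-to-token relevance matrix obtained by
    propagating attention through all layers.
    """
    n = layer_attns[0].shape[0]
    rollout = np.eye(n)
    for attn in layer_attns:
        # Mix in the identity for the residual connection, then
        # renormalize so each row remains a distribution.
        a = 0.5 * attn + 0.5 * np.eye(n)
        a = a / a.sum(axis=-1, keepdims=True)
        rollout = a @ rollout
    return rollout
```

Because every per-layer matrix is row-stochastic, the rolled-out matrix is too, so each output row can be read directly as a relevance distribution over tokens.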

## Cross-Modal Relevance Modeling

In multimodal scenarios, relevance propagation needs to handle the complex relationships between text tokens and visual features. LIME may adopt the following strategies:

- **Intra-modal Relevance**: Calculate associations between tokens within the same modality
- **Inter-modal Relevance**: Establish correspondence between text tokens and image regions
- **Hierarchical Aggregation**: Integrate relevance information from different Transformer layers
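These three strategies can be illustrated by partitioning a joint relevance matrix into modality blocks. The block layout (visual tokens first, then text tokens) and the grounding-score definition are assumptions made for this sketch:

```python
import numpy as np

def split_modal_relevance(relevance, n_image_tokens):
    """Partition a joint (tokens, tokens) relevance matrix into intra-
    and inter-modal blocks, assuming the first n_image_tokens
    positions are visual tokens."""
    k = n_image_tokens
    return {
        "image_to_image": relevance[:k, :k],  # intra-modal (visual)
        "text_to_text":   relevance[k:, k:],  # intra-modal (textual)
        "text_to_image":  relevance[k:, :k],  # inter-modal grounding
        "image_to_text":  relevance[:k, k:],
    }

def grounding_score(relevance, n_image_tokens):
    """Per text token: the fraction of its relevance mass that lands
    on image tokens. Low scores indicate weak visual grounding."""
    blocks = split_modal_relevance(relevance, n_image_tokens)
    text_rows = relevance[n_image_tokens:]
    return blocks["text_to_image"].sum(axis=-1) / text_rows.sum(axis=-1)
```

Hierarchical aggregation then amounts to computing these blocks per layer (or per group of layers) and combining them, e.g. by weighted averaging.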

## Dynamic Calculation During Inference

The key advantage of LIME is that all calculations are performed during inference, which means:

- **No Training Data Required**: No need for annotated hallucination samples for training
- **Model Agnostic**: Can be applied to any Transformer-based multimodal model
- **Real-Time Feedback**: Can instantly detect and respond to potential hallucinations
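The real-time property can be made concrete with a per-token check inside the decoding loop: as each token is generated, its attention over the inputs is inspected immediately. The threshold value and the abstain/re-rank reaction are illustrative choices, not part of the published method:

```python
import numpy as np

def flag_generated_token(attn_row, image_token_mask, threshold=0.2):
    """Inference-time check for one newly generated token.

    attn_row: (in_tokens,) attention from the new token to the inputs,
              averaged over layers and heads.
    image_token_mask: boolean mask marking the visual input positions.
    Returns True if visual grounding falls below the threshold.
    """
    attn_row = attn_row / attn_row.sum()
    visual_mass = attn_row[image_token_mask].sum()
    return bool(visual_mass < threshold)

# Toy decoding loop: each new token is checked as it is produced,
# with no training data and no model modification required.
rng = np.random.default_rng(0)
image_mask = np.array([True] * 16 + [False] * 8)  # 16 visual, 8 text inputs
for step in range(3):
    attn = rng.random(24)  # stand-in for the model's attention row
    if flag_generated_token(attn, image_mask):
        pass  # e.g. re-rank candidate tokens or abstain
```

Because the check reads attention the model computes anyway, the per-token overhead is a single reduction over the input length.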
