Zing Forum

FlexAC: A Flexible Control Framework for Associative Reasoning in Multimodal Large Language Models

Official implementation of the NeurIPS 2025 paper, a training-free lightweight framework that enables flexible switching between factuality and creativity in multimodal large models via hidden state intervention during inference.

Tags: multimodal large language models, associative reasoning, hallucination control, NeurIPS 2025, training-free, Qwen-VL, controllable AI
Published 2026-04-25 16:02 | Recent activity 2026-04-25 16:19 | Estimated read 6 min

Section 01

Introduction

This post covers the official implementation of the NeurIPS 2025 paper. FlexAC is a training-free, lightweight framework that lets a multimodal large language model switch between factuality and creativity by intervening on its hidden states during inference.

Section 02

Research Background and Challenges

Multimodal Large Language Models (MLLMs) face a fundamental trade-off in practical applications: factual accuracy versus creative expression. A single model rarely handles both well. An overly conservative model tends to give accurate but dull responses, while a loosely constrained model tends to hallucinate and produce content that contradicts the input.

The root of this dilemma is that the model's internal associative reasoning cannot be regulated flexibly. Associative reasoning is the model's ability to activate related concepts and knowledge in response to an input; it is the source of creativity but also a root cause of hallucinations. Controlling this association strength precisely across different task scenarios remains a core challenge in multimodal AI.

Section 03

Core Ideas of FlexAC

FlexAC (Flexible Associative Control) starts from a unifying perspective: factuality and creativity are treated as two ends of a single axis, the strength of associative reasoning. This reframes the problem from "how to optimize two objectives separately" to "how to regulate one continuous spectrum", which simplifies the technical path considerably.

The core innovations of the framework are:

  1. No retraining required: Intervention is performed entirely during inference, avoiding expensive model fine-tuning
  2. Lightweight implementation: Behavior is regulated through control vector injection alone
  3. Bidirectional adjustment: It can both suppress excessive association (reduce hallucinations) and enhance association (improve creativity)
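
The bidirectional adjustment above can be illustrated with a minimal sketch in plain Python. It assumes a single control vector that points toward stronger association and a scalar control factor; the function name `steer` and the toy values are hypothetical and are not taken from the FlexAC code base.

```python
from typing import List

Vector = List[float]


def steer(hidden: Vector, control: Vector, factor: float) -> Vector:
    """Shift a hidden state along a control direction.

    A negative factor moves the state away from the direction
    (less association); a positive factor moves it toward the
    direction (more association).
    """
    if len(hidden) != len(control):
        raise ValueError("hidden state and control vector must have the same size")
    return [h + factor * c for h, c in zip(hidden, control)]


# Toy values, purely illustrative.
hidden = [0.5, -1.0, 2.0]
association_direction = [1.0, 0.0, -1.0]  # hypothetical control vector

faithful = steer(hidden, association_direction, -1.0)  # suppress association
creative = steer(hidden, association_direction, +1.0)  # enhance association
```

The same model weights are used in both calls; only the sign of the factor changes, which is what makes the adjustment possible without retraining.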

Section 04

Technical Implementation Mechanism

The implementation of FlexAC has two main stages: offline control vector construction and dynamic control during inference.

Section 05

Offline Control Vector Construction

In this stage, the research team constructs stable control vectors through the following steps:

  • Hallucination-guided hidden state difference analysis: Identify the activation differences in hidden layers between high-association and normal-association states
  • High-association instance screening: Select representative high-association samples to build a more stable guidance vector
  • Task-specific sample fusion: Optionally integrate a small number of target task samples to improve adaptability
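
The first two steps can be sketched roughly as a difference of mean hidden states followed by a screening pass. This is an illustration under assumptions, not the authors' procedure: the projection-based screening rule, the `keep_ratio` parameter, and all function names are hypothetical, and the optional task-specific fusion step is left out.

```python
from statistics import fmean
from typing import List, Sequence

Vector = List[float]


def mean_vector(states: Sequence[Vector]) -> Vector:
    """Element-wise mean of a batch of hidden states."""
    return [fmean(column) for column in zip(*states)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def build_control_vector(
    high_states: Sequence[Vector],
    normal_states: Sequence[Vector],
    keep_ratio: float = 0.5,  # hypothetical screening parameter
) -> Vector:
    """Mean-difference direction from normal to high association."""
    normal_mean = mean_vector(normal_states)
    # Step 1: rough direction from the difference of means.
    rough = [h - n for h, n in zip(mean_vector(high_states), normal_mean)]
    # Step 2 (assumed rule): keep the high-association samples that
    # project most strongly onto the rough direction.
    ranked = sorted(
        high_states,
        key=lambda s: dot([x - n for x, n in zip(s, normal_mean)], rough),
        reverse=True,
    )
    selected = ranked[: max(1, int(len(ranked) * keep_ratio))]
    # Recompute the direction from the screened samples only.
    return [h - n for h, n in zip(mean_vector(selected), normal_mean)]


# Toy 2-dimensional hidden states.
normal = [[0.0, 0.0], [0.0, 2.0]]
high = [[2.0, 1.0], [4.0, 1.0], [0.0, 1.0], [2.0, 1.0]]
control_vector = build_control_vector(high, normal)
```

In this toy example the sample `[0.0, 1.0]` barely differs from the normal mean, so screening drops it and the resulting direction is longer than the unscreened difference of means.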

This stage produces two control vectors, one per direction:

  • text_normal_fea.pt: Standard association direction (biased towards factuality)
  • text_creative_fea.pt: High association direction (biased towards creativity)

Section 06

Dynamic Control During Inference

During actual inference, FlexAC injects control vectors into the middle layers of Qwen-VL (default layers 15-17) and dynamically adjusts the control strength through an adaptive calibration mechanism:

  • FlexAC-P (Faithfulness-oriented): Set the control factor to -1 to suppress excessive association and reduce hallucinations
  • FlexAC-C (Creativity-oriented): Set the control factor to +1 to enhance association ability and improve creativity

This design allows the same model to seamlessly switch behavior patterns between different tasks.
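
A rough sketch of this inference-time switching, in plain Python, is shown below. The layer indices and the control factors of -1 and +1 come from the description above. Everything else is an assumption: the post does not specify the adaptive calibration rule, so the norm-preserving rescale here is only a stand-in, and the names `inject` and `run_layers` are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]

TARGET_LAYERS = (15, 16, 17)  # default layers named in the text
MODE_FACTORS = {"FlexAC-P": -1.0, "FlexAC-C": +1.0}


def norm(v: Vector) -> float:
    return math.sqrt(sum(x * x for x in v))


def inject(hidden: Vector, control: Vector, factor: float) -> Vector:
    """Add the scaled control vector, then rescale to the original norm.

    ASSUMPTION: the rescale is a placeholder for the adaptive
    calibration mechanism, whose actual rule is not given in the post.
    """
    steered = [h + factor * c for h, c in zip(hidden, control)]
    steered_norm = norm(steered)
    if steered_norm == 0.0:
        return steered
    scale = norm(hidden) / steered_norm
    return [x * scale for x in steered]


def run_layers(
    hidden_by_layer: Dict[int, Vector], control: Vector, mode: str
) -> Dict[int, Vector]:
    """Apply the intervention only on the target layers."""
    factor = MODE_FACTORS[mode]
    result = {}
    for layer, hidden in hidden_by_layer.items():
        if layer in TARGET_LAYERS:
            result[layer] = inject(hidden, control, factor)
        else:
            result[layer] = list(hidden)
    return result


# Toy hidden states for one untouched layer and one target layer.
states = {14: [3.0, 4.0], 15: [3.0, 4.0]}
control = [1.0, 0.0]  # hypothetical association direction

faithful = run_layers(states, control, "FlexAC-P")
creative = run_layers(states, control, "FlexAC-C")
```

In a real model the same logic would typically live in a forward hook on the chosen decoder layers, so switching modes only means changing the factor between requests.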

Section 07

Experimental Validation and Datasets

FlexAC is evaluated on several established benchmarks:

Section 08

Hallucination Detection Benchmarks

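  • CHAIR (Caption Hallucination Assessment with Image Relevance): A metric for object hallucination in image captioning that measures how many mentioned objects are absent from the image
  • POPE (Polling-based Object Probing Evaluation): A benchmark that tests object-existence hallucination through yes/no questions about whether an object appears in the image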
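
To make these two benchmarks concrete, the sketch below computes their usual scores on toy data. It is a simplified illustration, not the official evaluation scripts: object extraction and synonym mapping are assumed to have been done already, each caption is reduced to a set of object names, and the function names are hypothetical.

```python
from typing import Dict, List, Set


def chair_scores(
    mentioned: List[Set[str]], ground_truth: List[Set[str]]
) -> Dict[str, float]:
    """CHAIR_i: share of mentioned objects that are not in the image.
    CHAIR_s: share of captions with at least one such object."""
    total_mentions = 0
    hallucinated_mentions = 0
    hallucinated_captions = 0
    for said, present in zip(mentioned, ground_truth):
        wrong = said - present
        total_mentions += len(said)
        hallucinated_mentions += len(wrong)
        hallucinated_captions += 1 if wrong else 0
    return {
        "chair_i": hallucinated_mentions / total_mentions if total_mentions else 0.0,
        "chair_s": hallucinated_captions / len(mentioned) if mentioned else 0.0,
    }


def pope_scores(predictions: List[bool], labels: List[bool]) -> Dict[str, float]:
    """Binary metrics over yes/no answers (True means 'yes, it is present')."""
    tp = sum(1 for p, l in zip(predictions, labels) if p and l)
    fp = sum(1 for p, l in zip(predictions, labels) if p and not l)
    fn = sum(1 for p, l in zip(predictions, labels) if not p and l)
    tn = sum(1 for p, l in zip(predictions, labels) if not p and not l)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    total = len(labels)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "yes_ratio": (tp + fp) / total if total else 0.0,
    }


# Toy data: the second caption mentions a sofa that is not in the image.
captions = [{"dog", "frisbee"}, {"cat", "sofa"}]
images = [{"dog", "frisbee", "grass"}, {"cat"}]
chair = chair_scores(captions, images)

answers = [True, True, False, False]
truth = [True, False, False, True]
pope = pope_scores(answers, truth)
```

Lower CHAIR values indicate fewer hallucinated objects, while for POPE higher accuracy and F1 are better and a yes ratio far from the true share of positive questions suggests a bias toward one answer.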