Zing Forum

MulDimIF: A Multi-Dimensional Constraint Framework for Systematically Enhancing Instruction-Following Capabilities of Large Language Models

MulDimIF is a multi-dimensional constraint framework proposed by Fudan University. It combines three constraint patterns, four constraint categories, and a four-level difficulty system to construct 9106 code-verifiable samples. Experiments show that reinforcement learning on data generated by the framework substantially improves models' instruction-following capabilities, and that the gains come mainly from parameter updates in the attention modules.

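MulDimIF, instruction following, ACL 2026, Fudan University, large language models, reinforcement learning, GRPO, attention mechanism, evaluation benchmark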
Published 2026-05-15 19:25 | Recent activity 2026-05-15 19:31 | Estimated read 7 min

Section 01

Introduction

Researchers at Fudan University propose MulDimIF, a multi-dimensional constraint framework that combines three constraint patterns, four constraint categories, and a four-level difficulty system to construct 9106 code-verifiable samples. Reinforcement learning on data from this framework substantially improves models' instruction-following capabilities, and the gains come mainly from parameter updates in the attention modules. The work has been accepted at ACL 2026, and an accompanying open-source toolchain supports both evaluation and training.

Section 02

Research Background and Motivation

Instruction following is a core measure of how useful a large language model is in practice, but existing research has two major limitations. First, evaluation is one-dimensional: benchmarks focus on constraint categories and ignore constraint complexity and conflicts between constraints. Second, the path to improvement is unclear: most work stops at evaluation and offers no effective way to strengthen the capability. MulDimIF addresses both by providing a fine-grained evaluation system together with a complete data generation and training scheme.

Section 03

Framework Design: Three Dimensions and Four Difficulty Levels

At the core of MulDimIF is a constraint analysis framework with three dimensions:

  1. Three constraint patterns: single, parallel, and nested, which show how instruction structure affects how hard an instruction is to follow;
  2. Four constraint categories: format, content, logic, and numerical;
  3. Four difficulty levels: Level I (Basic), Level II (Advanced), Level III (Complex, including conflicting constraints), and Level IV (Expert, nested logic).
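
To make the three dimensions concrete, here is a minimal Python sketch of how one sample could be labeled along them. The class names (`Pattern`, `Category`, `Level`, `Constraint`, `Sample`), the example instruction, and the level assigned to it are all hypothetical illustrations; the article does not describe the framework's actual data schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Pattern(Enum):
    """How the constraints in an instruction are structured."""
    SINGLE = "single"
    PARALLEL = "parallel"
    NESTED = "nested"


class Category(Enum):
    """What kind of requirement a constraint expresses."""
    FORMAT = "format"
    CONTENT = "content"
    LOGIC = "logic"
    NUMERICAL = "numerical"


class Level(Enum):
    """Difficulty levels as named in the article."""
    I = 1    # Basic
    II = 2   # Advanced
    III = 3  # Complex, including conflicting constraints
    IV = 4   # Expert, nested logic


@dataclass(frozen=True)
class Constraint:
    category: Category
    description: str


@dataclass(frozen=True)
class Sample:
    instruction: str
    pattern: Pattern
    level: Level
    constraints: Tuple[Constraint, ...]


# Hypothetical sample: two independent constraints applied side by side.
sample = Sample(
    instruction="Summarize the report in at most 50 words, as a bulleted list.",
    pattern=Pattern.PARALLEL,
    level=Level.II,  # assumed label, chosen only for illustration
    constraints=(
        Constraint(Category.NUMERICAL, "at most 50 words"),
        Constraint(Category.FORMAT, "bulleted list"),
    ),
)
```

Keeping the pattern, category, and level as separate fields means results can later be grouped along any one dimension without re-parsing the instruction text.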

Section 04

Data Generation Pipeline and Code Verification Mechanism

Based on the framework, the authors design a three-stage generation pipeline: constraint expansion (an LLM generates diverse constraint variants) → conflict detection (constraint combinations that cannot be satisfied simultaneously are identified) → instruction rewriting (the constraints are converted into natural-language instructions). The pipeline yields 9106 code-verifiable samples (7906 for training and 1200 for testing). Because each sample is checked by code rather than by human judges, evaluation avoids subjective disagreement and remains objective and scalable.
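
As an illustration of what "code-verifiable" can mean in practice, the sketch below expresses each constraint as a small Python predicate over the model's response, and adds a toy conflict check of the kind the second pipeline stage would need. All function names (`max_words`, `must_contain`, `is_json_object`, `verify`, `word_bounds_conflict`) are hypothetical; the article does not show the framework's real verifiers.

```python
import json
from typing import Callable, Dict

# A verifier maps a model response to pass/fail.
Verifier = Callable[[str], bool]


def max_words(limit: int) -> Verifier:
    """Numerical constraint: the response has at most `limit` whitespace-separated words."""
    return lambda response: len(response.split()) <= limit


def must_contain(keyword: str) -> Verifier:
    """Content constraint: the response mentions `keyword` (case-insensitive)."""
    return lambda response: keyword.lower() in response.lower()


def is_json_object(response: str) -> bool:
    """Format constraint: the response parses as a JSON object."""
    try:
        return isinstance(json.loads(response), dict)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


def verify(response: str, checks: Dict[str, Verifier]) -> Dict[str, bool]:
    """Run every check and report the result per constraint."""
    return {name: check(response) for name, check in checks.items()}


def word_bounds_conflict(min_words: int, max_words_allowed: int) -> bool:
    """Toy conflict detection: a lower bound above the upper bound is unsatisfiable."""
    return min_words > max_words_allowed


response = '{"summary": "GRPO improves instruction following"}'
checks = {
    "json_object": is_json_object,
    "max_10_words": max_words(10),
    "mentions_grpo": must_contain("grpo"),
}
results = verify(response, checks)
```

Because every check is deterministic code, the same response always receives the same verdict, which is the property the article credits for objective and scalable evaluation.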

Section 05

Experimental Results and Reinforcement Learning Improvements

An evaluation of 18 models from 6 families, covering both open-source and closed-source models, found:

  • A clear difficulty gradient: accuracy falls from 80.82% at Level I to 36.76% at Level IV, a drop of about 44 percentage points;
  • Differences between model families: open-source and closed-source models show significant gaps in complex scenarios;
  • Constraint sensitivity: format constraints are handled easily, while nested logic is a difficulty shared across models.

Six models with ≤14 billion parameters were then trained with the GRPO algorithm, which significantly improved instruction-following performance without impairing general capabilities.
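
For readers unfamiliar with GRPO, its central step is to score each sampled response relative to the other responses drawn for the same prompt. The sketch below shows only that group-relative advantage computation, not the full algorithm (the clipped policy objective and KL term are omitted). The reward definition (fraction of constraints satisfied), the use of the population standard deviation, and the `eps` value are assumptions made for illustration; the article does not specify the training configuration.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar per response sampled for the same prompt.
    `eps` avoids division by zero when all responses score the same.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed reward: fraction of code-verified constraints each response satisfies.
rewards = [1.0, 0.5, 0.5, 0.0]
advantages = group_relative_advantages(rewards)

# When every response gets the same reward, no response is preferred.
flat = group_relative_advantages([1.0, 1.0, 1.0])
```

Responses that satisfy more constraints than their group's average receive a positive advantage and are reinforced, which is why code-verifiable rewards pair naturally with this kind of training.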

Section 06

Parameter-Level Analysis: The Key Role of the Attention Module

Parameter-level analysis shows that updates to the attention modules (weights and projection layers) are highly correlated with gains in instruction following. The proposed explanation is twofold: training strengthens constraint recognition, so the model focuses on the key constraints in an instruction, and it helps the model retain constraints, so fewer are forgotten. This offers guidance for model design: optimizing the attention mechanism is a key lever for improving instruction following.
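
One simple way to carry out this kind of parameter-level comparison is to measure how far each module group moved between the base and the trained checkpoint, relative to its original size. The sketch below does this on toy values in pure Python. The grouping rule in `module_of`, the parameter names, and the numbers are hypothetical and are not results from the paper; the article does not state which metric the authors used.

```python
import math
from collections import defaultdict
from typing import Dict, List

# Parameter name -> flattened weight values (toy stand-in for real tensors).
Weights = Dict[str, List[float]]


def module_of(name: str) -> str:
    """Assign a parameter to a coarse module group by name (assumed naming scheme)."""
    if "attn" in name:
        return "attention"
    if "mlp" in name:
        return "mlp"
    return "other"


def relative_update_by_module(before: Weights, after: Weights) -> Dict[str, float]:
    """Return ||after - before|| / ||before|| (L2 norms) for each module group."""
    delta_sq: Dict[str, float] = defaultdict(float)
    base_sq: Dict[str, float] = defaultdict(float)
    for name, old in before.items():
        new = after[name]
        group = module_of(name)
        delta_sq[group] += sum((n - o) ** 2 for o, n in zip(old, new))
        base_sq[group] += sum(o ** 2 for o in old)
    return {
        group: math.sqrt(delta_sq[group]) / math.sqrt(base_sq[group])
        for group in delta_sq
        if base_sq[group] > 0
    }


# Toy checkpoints, invented for illustration only.
before = {
    "layers.0.self_attn.q_proj.weight": [1.0, 0.0],
    "layers.0.mlp.up_proj.weight": [1.0, 0.0],
}
after = {
    "layers.0.self_attn.q_proj.weight": [1.0, 0.5],
    "layers.0.mlp.up_proj.weight": [1.0, 0.1],
}
updates = relative_update_by_module(before, after)
```

In this toy case the attention group moves five times as far as the MLP group in relative terms; on real checkpoints, a consistently larger relative update in the attention modules would be in line with the pattern the article describes.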

Section 07

Open-Source Ecosystem and Application Prospects

The open-source toolchain covers inference (high-throughput vLLM as well as closed-source API calls), automatic evaluation, the RL training process, and the instruction generation pipeline. Potential applications include model selection, fine-tuning guidance, prompt engineering optimization, and the construction of domain-specific evaluation benchmarks.

Section 08

Conclusion: From Experience-Driven to Framework-Driven Research

MulDimIF marks a shift in instruction-following research from experience-driven to framework-driven work, and it provides both theoretical and tooling foundations. For engineers and researchers, it offers a systematic methodology, showing that instruction following can be analyzed, measured, and improved with scientific methods rather than treated as a mystery.