Zing Forum

MASPrism: A Lightweight Fault Attribution Framework for Multi-Agent Systems Based on Pre-Fill Phase Signals

MASPrism is a lightweight fault attribution framework that uses pre-fill phase signals from a small language model (SLM) to identify faulty steps in multi-agent systems. By extracting token-level negative log-likelihood (NLL) and attention weights, it locates fault sources without decoding any output tokens. It reports accuracy gains on the Who&When and TRAIL benchmarks along with a 6.69x speedup.

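Multi-agent systems, Fault attribution, Pre-fill phase, Small language models, LLM diagnosis, Attention mechanism, Negative log-likelihood, MASPrism, Agent monitoring, Observability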
Published 2026-05-08 17:40 | Recent activity 2026-05-11 12:48 | Estimated read 6 min

Section 01

MASPrism Framework Guide: Innovative Breakthrough in Lightweight Multi-Agent Fault Attribution

MASPrism attributes faults using two internal signals from the pre-fill phase of a small language model (SLM): token-level negative log-likelihood and attention weights. Because both signals are available before any output token is generated, the framework can locate the faulty step in a multi-agent run without decoding. It reports accuracy gains on the Who&When and TRAIL benchmarks and a 6.69x speedup, which makes it an efficient option for multi-agent fault diagnosis.

Section 02

Research Background and Challenges

As LLMs take on more complex tasks, multi-agent systems have become a core paradigm, but locating faults in them is difficult for three reasons. First, a single run involves a large number of agent actions and tool calls. Second, evidence of a fault often surfaces only some steps after the step that caused it. Third, traditional methods rely on expensive replay, backtracking, or training on synthetic logs, which makes real-time diagnosis impractical. Developers need lightweight, low-overhead fault localization.

Section 03

Overview of the MASPrism Framework

The core idea of MASPrism is to use internal signals from the pre-fill phase of an SLM to localize faults without generating output tokens. It uses Qwen3-0.6B as the base SLM and has an average processing time of 2.66 seconds, a reported 6.69x speedup, while preserving diagnostic accuracy.

Section 04

Core Technical Mechanisms

Dual-Phase Pre-Fill Strategy

The first phase extracts token-level negative log-likelihood (NLL) and attention weights, which reflect how surprising the model finds each token and which parts of the context it attends to. The second phase builds a focused diagnostic prompt and uses it to rank candidate fault sources.
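
The two phases can be pictured with a minimal sketch. It assumes per-step NLL and attention values are already available, and it combines them with a summed z-score, which is an illustrative choice rather than the scoring rule of the framework. The function names, the `k` parameter, and the prompt wording are all hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values; returns zeros when all values are equal."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


def select_candidates(step_nll, step_attn, k=3):
    """Phase 1 (assumed scoring): keep the k steps with the most abnormal signals."""
    scores = [a + b for a, b in zip(zscores(step_nll), zscores(step_attn))]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def build_diagnostic_prompt(steps, candidates):
    """Phase 2 (assumed layout): a prompt that contains only the candidate steps."""
    lines = ["The following steps are candidate fault sources:"]
    for i in sorted(candidates):
        lines.append(f"[step {i}] {steps[i]}")
    lines.append("Which step introduced the fault?")
    return "\n".join(lines)


# Hypothetical four-step trace with made-up signal values.
steps = ["plan task", "search web", "parse result", "write answer"]
candidates = select_candidates([1.0, 1.1, 4.0, 0.9], [0.1, 0.1, 0.6, 0.2], k=2)
prompt = build_diagnostic_prompt(steps, candidates)
```

In this sketch the focused prompt would itself be passed through the SLM's pre-fill pass to score the remaining candidates; how that second ranking is computed is not described here, so it is left out.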

Lightweight Signal Extraction

It reuses the attention matrices and probability distributions that the pre-fill pass already computes, so extracting the signals requires no extra forward pass or decoding. NLL measures prediction confidence and attention weights measure how strongly each part of the context is referenced; steps with abnormal values are treated as likely fault sources.
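
As a rough illustration of these two signals, the sketch below computes per-token NLL from raw logits and aggregates NLL and attention per step. It uses plain Python lists in place of model tensors; with Hugging Face Transformers, for example, a single forward pass returns logits and, when called with `output_attentions=True`, attention weights. The step spans, the mean aggregation, and the function names are assumptions made for the example.

```python
import math


def token_nll(logits, token_ids):
    """Per-token NLL. The logits at position i predict the token at position i + 1."""
    nll = []
    for i in range(len(token_ids) - 1):
        row = logits[i]
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        nll.append(log_z - row[token_ids[i + 1]])
    return nll  # nll[i] is the score of token i + 1


def step_mean_nll(nll, spans):
    """Mean NLL over each step's token span (start inclusive, end exclusive)."""
    out = []
    for start, end in spans:
        # Token t is scored by nll[t - 1]; token 0 has no prediction, so skip it.
        vals = [nll[t - 1] for t in range(max(start, 1), end)]
        out.append(sum(vals) / len(vals) if vals else 0.0)
    return out


def step_attention_mass(attn_row, spans):
    """Share of one query position's attention that lands on each step's tokens."""
    return [sum(attn_row[start:end]) for start, end in spans]


# Toy example: vocabulary of 3 tokens, sequence of 3 tokens, two steps.
logits = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
token_ids = [1, 2, 0]
spans = [(0, 2), (2, 3)]
per_step_nll = step_mean_nll(token_nll(logits, token_ids), spans)
per_step_attn = step_attention_mass([0.1, 0.2, 0.7], spans)
```

A step whose mean NLL is well above that of its neighbours is one the model found hard to predict, which is the kind of abnormal signal the text refers to.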

Section 05

Experimental Evaluation and Performance

Benchmark Setup

Tests were conducted on the Who&When-HC benchmark (identifying which agent is responsible for a failure and at which step it occurred) and the TRAIL benchmark (tool usage scenarios). The baselines included prompt engineering, supervised learning methods, and commercial models such as Gemini-2.5-Pro.

Performance Metrics

Top-1 accuracy on Who&When-HC improved by 33.41%; on the TRAIL benchmark, MASPrism outperformed Gemini-2.5-Pro by 89.50%.
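
For readers unfamiliar with the metric, Top-1 accuracy is commonly defined as the fraction of traces in which the highest-ranked step matches the annotated faulty step. The sketch below computes it under that assumption; the function name and data layout are illustrative and not taken from the benchmarks' own tooling.

```python
def top1_accuracy(ranked_predictions, true_steps):
    """Fraction of traces whose top-ranked step equals the labelled faulty step."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_steps)
        if ranked and ranked[0] == truth
    )
    return hits / len(true_steps)


# Three hypothetical traces: the first and third are attributed correctly.
accuracy = top1_accuracy([[2, 3], [0, 1], [4]], [2, 1, 4])
```

Whether the reported gains are absolute percentage points or relative improvements is not specified above, so the figures are best compared against the original results tables.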

Efficiency Advantages

The average processing time is 2.66 seconds, a 6.69x speedup, and no output tokens are generated, which makes the method suitable for resource-constrained environments.
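
If the 6.69x figure is the ratio of average processing times (an assumption, since the baseline is not named here), the implied baseline time follows from simple arithmetic. The helper name below is hypothetical.

```python
def implied_baseline_seconds(method_seconds: float, speedup: float) -> float:
    """Baseline time implied by a speedup, assuming speedup = baseline / method."""
    return method_seconds * speedup


baseline = implied_baseline_seconds(2.66, 6.69)
print(f"{baseline:.2f} s")  # prints "17.80 s"
```

Under that reading, the compared method would take roughly 17.8 seconds on average for the same workload.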

Section 06

Practical Application Scenarios

  • Production Environment Monitoring: Monitor multi-agent runs in real time and generate diagnostic reports immediately, reducing troubleshooting time.
  • Development and Debugging Assistance: Quickly locate the failed step to guide changes to prompt design or system architecture.
  • Automated Quality Assurance: Integrate into CI/CD pipelines to provide fault attribution for failed test cases.

Section 07

Technical Limitations and Future Directions

Limitations

  • Mainly applicable to text-based multi-agent systems; multi-modal scenarios require adjustment of signal extraction strategies;
  • The interpretability of pre-fill signals needs to be improved.

Future Directions

  • Adapt the framework to multi-modal interaction scenarios;
  • Improve signal interpretability and turn the signals into intuitive diagnostic recommendations;
  • Use active learning on real failure cases to refine the attribution strategy.

Section 08

Conclusions and Insights

MASPrism shows that internal model signals can support efficient, low-overhead fault attribution in multi-agent systems. It offers a new approach to observability for LLM applications, improving maintainability and reliability without retraining or architectural changes, which matters for deploying multi-agent systems at scale.