# MASPrism: A Lightweight Fault Attribution Framework for Multi-Agent Systems Based on Pre-Fill Phase Signals

> MASPrism is a lightweight fault attribution framework that uses pre-fill phase signals from a small language model (SLM) to identify faulty steps in multi-agent systems. By extracting token-level negative log-likelihood (NLL) and attention weights, it locates fault sources without decoding any output tokens. It reports accuracy gains on the Who&When and TRAIL benchmarks together with a 6.69x speedup.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-08T09:40:53.000Z
- Last activity: 2026-05-11T04:48:03.068Z
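- Keywords: multi-agent systems, fault attribution, pre-fill phase, small language models, LLM diagnostics, attention mechanism, negative log-likelihood, MASPrism, agent monitoring, observability
- Page link: https://www.zingnex.cn/en/forum/thread/masprism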
- Canonical: https://www.zingnex.cn/forum/thread/masprism

---

## MASPrism Framework Guide: Innovative Breakthrough in Lightweight Multi-Agent Fault Attribution

MASPrism is a lightweight fault attribution framework built on pre-fill phase signals. It reads two internal signals from a small language model (SLM), token-level negative log-likelihood and attention weights, to identify faulty steps in multi-agent systems, and it locates fault sources without decoding. On Who&When-HC it improves Top-1 accuracy by 33.41%, on TRAIL it outperforms Gemini-2.5-Pro by 89.50%, and it runs with a 6.69x speedup, which makes it an efficient option for multi-agent fault diagnosis.

## Research Background and Challenges

As LLMs are applied to complex tasks, multi-agent systems have become a core paradigm, but locating faults in them is difficult for three reasons. A single execution involves a large number of agent actions and tool calls. Evidence of a fault tends to lag behind the step that caused it. Traditional methods rely on expensive replay, backtracking, or training on synthetic logs, which makes real-time diagnosis impractical. Developers need lightweight, low-overhead fault localization.

## Overview of the MASPrism Framework

The core idea of MASPrism is to use internal signals from the pre-fill phase of an SLM to localize faults without generating output tokens. It uses Qwen3-0.6B as the base SLM and has an average processing time of 2.66 seconds, a 6.69x speedup, while maintaining diagnostic accuracy.

## Core Technical Mechanisms

### Dual-Phase Pre-Fill Strategy
The first phase extracts token-level negative log-likelihood (NLL) and attention weights, which reflect how surprised the model is by each token and which parts of the context it attends to. The second phase constructs focused diagnostic prompts and uses them to rank candidate fault sources.
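
To make the first-phase signal concrete, the following minimal sketch computes token-level NLL from next-token logits of the kind a pre-fill forward pass produces. It is an illustration under assumptions, not MASPrism's implementation: the function names `log_softmax` and `token_nll`, the plain-list logits layout, and the toy values are all hypothetical.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def token_nll(logits, token_ids):
    """Return the NLL of each token given its prefix.

    Assumption: logits[t] scores the token at position t + 1, so the
    first token has no prediction and the result has len(token_ids) - 1
    entries.
    """
    nll = []
    for t in range(1, len(token_ids)):
        log_probs = log_softmax(logits[t - 1])
        nll.append(-log_probs[token_ids[t]])
    return nll

# Toy example with a vocabulary of 4 tokens (values are made up).
logits = [
    [0.0, 0.0, 0.0, 0.0],  # uniform: the model has no preference
    [2.0, 0.0, 0.0, 0.0],  # favours token 0
    [0.0, 0.0, 0.0, 0.0],  # unused: no token follows the last position
]
token_ids = [1, 0, 0]
nll = token_nll(logits, token_ids)
```

In this toy input the second token is predicted from a uniform distribution, so its NLL equals `log(4)`, while the third token is the one the model favoured and receives a lower NLL. High-NLL tokens are the ones the model found surprising.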
### Lightweight Signal Extraction
It reuses the attention matrices and probability distributions that the pre-fill phase already computes, so extraction adds little computation beyond the forward pass itself. It focuses on two signals: NLL, which indicates how confidently the model predicts each token, and attention weights, which indicate which parts of the context the model refers to. Abnormal values in these signals point to faulty steps.
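
One way to turn token-level signals into a per-step ranking is sketched below. The source does not specify how MASPrism combines the two signals, so everything here is an assumption made for illustration: the names `step_scores` and `rank_steps`, the min-max normalization, the weight `alpha`, and the choice to treat both high mean NLL and high attention mass as suspicious.

```python
def step_scores(token_nll, token_attn, step_spans, alpha=0.5):
    """Score each step from token-level signals.

    token_nll:  one NLL value per token.
    token_attn: attention mass each token receives (hypothetical input,
                e.g. averaged over heads).
    step_spans: (start, end) token ranges per step, end exclusive.
    alpha:      hypothetical weight between the two signals.
    """
    mean_nll = [sum(token_nll[s:e]) / (e - s) for s, e in step_spans]
    attn_mass = [sum(token_attn[s:e]) for s, e in step_spans]

    def normalize(values):
        lo, hi = min(values), max(values)
        if hi == lo:
            return [0.0 for _ in values]
        return [(v - lo) / (hi - lo) for v in values]

    n, a = normalize(mean_nll), normalize(attn_mass)
    return [alpha * x + (1 - alpha) * y for x, y in zip(n, a)]

def rank_steps(scores):
    """Return step indices ordered from most to least suspicious."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy trace with three steps of two tokens each (values are made up).
toy_nll = [0.2, 0.3, 2.5, 2.7, 0.4, 0.1]
toy_attn = [0.05, 0.05, 0.4, 0.3, 0.1, 0.1]
toy_spans = [(0, 2), (2, 4), (4, 6)]
ranking = rank_steps(step_scores(toy_nll, toy_attn, toy_spans))
```

Here the middle step has both the highest mean NLL and the largest attention mass, so it is ranked first. A real system would need to validate the scoring rule against labeled failures before relying on it.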

## Experimental Evaluation and Performance

### Benchmark Setup
Tests were conducted on two benchmarks: Who&When-HC, which requires identifying the responsible agent and the step at which the failure occurred, and TRAIL, which covers tool usage scenarios. The baselines included prompt engineering, supervised learning methods, and commercial models such as Gemini-2.5-Pro.
### Performance Metrics
MASPrism improves Top-1 accuracy on Who&When-HC by 33.41%. On the TRAIL benchmark, it outperforms Gemini-2.5-Pro by 89.50%.
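
For readers unfamiliar with the metric, the sketch below shows how Top-1 (and, more generally, Top-k) accuracy is typically computed for fault attribution: a trace counts as a hit when the true faulty step appears among the first k ranked predictions. The function name `top_k_accuracy` and the toy data are hypothetical and are not taken from the benchmarks.

```python
def top_k_accuracy(ranked_predictions, true_steps, k=1):
    """Fraction of traces whose true faulty step is in the top k predictions.

    ranked_predictions: one list of step indices per trace, most
                        suspicious first.
    true_steps:         the labeled faulty step index per trace.
    """
    if len(ranked_predictions) != len(true_steps):
        raise ValueError("predictions and labels must have the same length")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_steps)
        if truth in ranked[:k]
    )
    return hits / len(true_steps)

# Toy data: three traces with made-up rankings and labels.
toy_predictions = [[1, 2, 0], [0, 1], [3, 2, 1]]
toy_labels = [1, 1, 2]
top1 = top_k_accuracy(toy_predictions, toy_labels, k=1)
top2 = top_k_accuracy(toy_predictions, toy_labels, k=2)
```

In the toy data only the first trace has the true step ranked first, so Top-1 accuracy is one third, while every true step appears within the first two predictions.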
### Efficiency Advantages
MASPrism has an average processing time of 2.66 seconds, a 6.69x speedup, and generates zero output tokens, which makes it suitable for resource-constrained environments.
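
As a quick sanity check on these figures, the arithmetic below derives the comparison time implied by the two numbers. It assumes the speedup is the ratio of the comparison method's average time to MASPrism's average time, which the text does not state explicitly. The function name is hypothetical, and the result is an implied value rather than a reported measurement.

```python
def implied_baseline_seconds(avg_seconds, speedup):
    """Baseline time implied by a speedup defined as baseline / new."""
    return avg_seconds * speedup

# 2.66 s and 6.69x come from the text; the product is derived, not reported.
baseline = implied_baseline_seconds(2.66, 6.69)
```

Under that assumption the comparison method would take roughly 17.8 seconds on average.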

## Practical Application Scenarios

- **Production Environment Monitoring**: Monitor multi-agent processes in real time and generate diagnostic reports as failures occur, reducing troubleshooting time.
- **Development and Debugging Assistance**: Quickly locate failed steps to guide changes to prompt design or system architecture.
- **Automated Quality Assurance**: Integrate into CI/CD pipelines to provide fault attribution analysis for failed test cases.
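
The CI/CD scenario above could be wired up roughly as follows. This is a hypothetical sketch: the report schema, the function name `build_attribution_report`, and the per-step `agent` field are assumptions for illustration, and the scores are presumed to come from an attribution step that is not shown here.

```python
import json

def build_attribution_report(test_id, steps, scores, top_k=3):
    """Build a JSON-serializable report of the most suspicious steps.

    steps:  list of dicts with at least an "agent" key (hypothetical schema).
    scores: one suspicion score per step, higher means more suspicious.
    """
    ranked = sorted(range(len(steps)), key=lambda i: scores[i], reverse=True)
    return {
        "test_id": test_id,
        "suspects": [
            {
                "rank": rank + 1,
                "step_index": i,
                "agent": steps[i]["agent"],
                "score": round(scores[i], 3),
            }
            for rank, i in enumerate(ranked[:top_k])
        ],
    }

# Toy failed test case with made-up agents and scores.
toy_steps = [{"agent": "planner"}, {"agent": "coder"}, {"agent": "reviewer"}]
report = build_attribution_report("test_checkout_flow", toy_steps, [0.1, 0.9, 0.4], top_k=2)
report_json = json.dumps(report, indent=2)
```

A pipeline step could attach `report_json` to the failed job as an artifact so that developers see the highest-ranked suspect steps next to the test failure.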

## Technical Limitations and Future Directions

### Limitations
- Mainly applicable to text-based multi-agent systems; multi-modal scenarios require adjustment of signal extraction strategies;
- The interpretability of pre-fill signals needs to be improved.
### Future Directions
- Adapt to multi-modal interaction scenarios;
- Optimize signal interpretability and convert it into intuitive diagnostic recommendations;
- Combine active learning to optimize strategies using real failure cases.

## Conclusions and Insights

MASPrism shows that internal model signals can support efficient, low-overhead fault attribution in multi-agent systems. It offers a new approach to observability for LLM applications, improving maintainability and reliability without retraining or architecture changes, which matters for deploying multi-agent systems at scale.
