Zing Forum

MemTrace: An Open-Source Framework for Tracking Memory System Errors in Large Language Models

MemTrace is an open-source framework for debugging LLM memory systems, developed by the NLP team at Zhejiang University. It converts memory processes into executable memory evolution graphs, which enables fine-grained, operation-level error attribution, and it supports automatic prompt optimization to improve task performance.

Tags: LLM, memory system, error tracing, debugging, MemTrace, ZJUNLP, RAG, Mem0, EverMemOS, prompt optimization
Published 2026-06-09 17:40 | Recent activity 2026-06-09 17:48 | Estimated read 6 min

Section 01

[Introduction] MemTrace: An Open-Source Debugging Framework for Tracking LLM Memory System Errors

MemTrace is an open-source framework for debugging LLM memory systems, developed by the NLP team at Zhejiang University (ZJUNLP). Its core function is to convert memory processes into executable memory evolution graphs (operation-variable execution graphs). These graphs enable fine-grained, operation-level error attribution and support automatic prompt optimization to improve task performance. The framework was open-sourced on June 9, 2026, and the corresponding paper was submitted on May 27, 2026. The code repository is on GitHub (https://github.com/zjunlp/MemTrace), and the paper is available at https://arxiv.org/abs/2605.28732.

Section 02

Background: Core Pain Points in LLM Memory System Debugging

The memory system is a key component of LLM applications that need long-range reasoning and multi-turn dialogue; it covers approaches such as RAG, Mem0, and EverMemOS. Locating errors in it is difficult because a wrong answer can originate at several stages: a fact missed during extraction, a memory update that overwrites earlier content, retrieval that returns irrelevant memories, or a generation step that misreads the retrieved context. Traditional logs only present text-level call records and cannot reveal the data dependencies and information flow between operations. MemTrace addresses this by converting the memory execution process into a traceable, structured graph.

Section 03

MemTrace Core Architecture: Operation-Variable Execution Graph

The core innovation of MemTrace is the Operation-Variable Execution Graph:

  • Variables: Represent data entities such as user messages, extracted facts, stored memories, and retrieval results
  • Operations: Represent computational steps such as fact extraction, memory update, retrieval, and generation

The framework includes four components:

  1. Smartcomment tracking layer (non-intrusively records execution graphs)
  2. MemTraceBench benchmark dataset (covers labeled failure cases for four types of memory systems)
  3. Graph-level automatic attribution algorithm (locates faulty operations and error types)
  4. Diagnostic report and automatic optimization module (outputs suggestions and optimizes prompts)

Section 04

Error Attribution Mechanism: Counterfactual Analysis for Root Cause Localization

MemTrace adopts an iterative subgraph tracing strategy. Starting from the output node, it traverses the execution graph backward and uses counterfactual analysis to measure how much each operation contributes to the final error, that is, whether changing the operation's output would correct the answer. This separates root-cause operations from errors that merely propagate downstream. The research finds that memory system failures are mostly systemic issues, such as information loss, retrieval misalignment, and update conflicts.

Section 05

Automatic Optimization: From Attribution to Performance Improvement

MemTrace uses the attribution signals to optimize prompts in a targeted way, for example by strengthening fact extraction guidance or improving retrieval relevance judgment, which closes the loop between diagnosis and optimization. Experimental results show that this mechanism improves end-to-end task performance by up to 7.62% without manual intervention.

Section 06

Quick Start and Ecosystem Integration

MemTrace can be installed via pip or uv and requires Python ≥3.12. It has built-in loading for the MemTraceBench dataset and a visualization interface through AgentScope Studio. It also provides ready-to-use integration with MemBase: MemBase users can track the memory lifecycle via smartcomment and generate execution graph data.

Section 07

Conclusion and Outlook: Promoting Memory System Controllability

MemTrace is an important advancement in the observability of LLM memory systems. It turns black-box memory processes into white-box execution graphs, giving developers the means to debug and optimize them. As LLMs are adopted more widely in scenarios such as long conversations and personalized assistants, the reliability of memory systems becomes increasingly critical. MemTrace is expected to become infrastructure in this field, pushing memory systems from 'usable' to 'controllable'.