Zing Forum


ASEM: Enabling Self-Evolving Memory Systems for Large Language Model Agents

ASEM is a five-stage memory framework that lets LLM agents maintain a living knowledge network across conversations and self-evolve through structured memory organization, reinforcement-learning-driven management, and value-aware retrieval.

LLM Agents · Memory Systems · Reinforcement Learning · RAG · Self-Evolution · Continual Learning
Published 2026-04-04 22:15 · Recent activity 2026-04-04 22:18 · Estimated read 6 min

Section 01

[Introduction] ASEM: Enabling Self-Evolving Memory Systems for LLM Agents

ASEM (Agentic Self-Evolving Memory) is a five-stage memory framework designed to address frozen knowledge and the inability to learn continuously in Large Language Model (LLM) agents. Through structured memory organization, reinforcement-learning-driven management, and value-aware retrieval, ASEM allows agents to maintain a living knowledge network across conversations and achieve self-evolution. Its core innovations are multi-attribute atomic notes, a memory manager trained with GRPO, two-stage hybrid retrieval, and non-parametric utility updates, providing a new path for the practical deployment of LLM agents and for research on continual learning.


Section 02

Background: The Knowledge Freezing Dilemma of LLM Agents

Current LLM agents have their knowledge frozen in pre-trained parameters: fine-tuning is too costly to run frequently, and they lack any mechanism for cross-conversation memory or learning from experience. ASEM addresses this fundamental challenge by keeping the underlying model frozen and achieving continuous learning and adaptation through an external memory bank with utility estimation.


Section 03

Core Method: ASEM's Five-Stage Memory Framework

ASEM's core is a five-stage memory framework:

  1. Multi-attribute Atomic Notes: Each note stores the original content, an embedding vector, keywords, and other metadata, supporting multi-dimensional retrieval;
  2. Reinforcement Learning-Driven Management: A memory manager trained with GRPO optimizes memory writing/organization/update strategies;
  3. Two-Stage Hybrid Retrieval: First recall semantically similar content, then perform value-aware reranking (weighing how much each memory helps the task);
  4. Non-parametric Utility Update: Use an EMA to dynamically adjust memory utility, which is lightweight and effective;
  5. Plug-and-Play Inference Backend: Supports HuggingFace and LangChain; training uses HuggingFace exclusively.
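Stages 1, 3, and 4 above can be sketched in a few dozen lines. This is a minimal illustration, not the paper's implementation: the `AtomicNote` class, the `alpha` blend weight, and the `beta` EMA decay are all assumed names/values for the sake of the example.

```python
from dataclasses import dataclass

@dataclass
class AtomicNote:
    content: str
    embedding: list       # semantic embedding vector (stage 1 metadata)
    keywords: set         # keyword metadata for exact-match lookup
    utility: float = 0.5  # running estimate of how useful this note is

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_emb, notes, k_recall=10, k_final=3, alpha=0.5):
    # Stage one of retrieval: recall by semantic similarity only.
    recalled = sorted(notes, key=lambda n: cosine(query_emb, n.embedding),
                      reverse=True)[:k_recall]
    # Stage two: value-aware rerank blending similarity with learned utility.
    def score(n):
        return alpha * cosine(query_emb, n.embedding) + (1 - alpha) * n.utility
    return sorted(recalled, key=score, reverse=True)[:k_final]

def update_utility(note, reward, beta=0.9):
    # Non-parametric EMA update from task feedback (reward assumed in [0, 1]).
    note.utility = beta * note.utility + (1 - beta) * reward
    return note.utility
```

With `alpha = 0.5`, a slightly less similar note that has proven useful can outrank an exact semantic match that never helped, which is the point of the value-aware second stage.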

Section 04

Technical Implementation: Code Structure and Workflow

Core modules of the ASEM codebase:

  • asem/: Core functions such as memory management, retrieval, and utility estimation;
  • training/: GRPO training loop (training for memory manager and response agent);
  • eval/: Evaluation framework and baseline comparison;
  • configs/: Default hyperparameter configurations;
  • data/: Prompts and benchmark test resources;
  • scripts/: Model download and performance analysis tools.

A complete training and evaluation workflow is provided, supporting benchmark testing, result generation, and manual evaluation.
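The GRPO loop in training/ relies on a group-relative baseline: for each prompt, a group of candidate memory operations is sampled and each reward is normalized against the group's mean and standard deviation, so no separate value network is needed. A minimal sketch of that advantage computation (function name and `eps` are illustrative, not taken from the repo):

```python
def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages for GRPO.

    `rewards` holds the scalar rewards of all rollouts sampled for the
    same prompt; each advantage is that rollout's reward standardized
    against the group mean and standard deviation.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]
```

Rollouts that beat the group average get positive advantages and are reinforced; below-average ones are suppressed, which is how the memory manager's write/organize/update policy improves.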

Section 05

Practical Significance: Application Value in Enterprise and Research Fields

Enterprise applications: agents that remember user preferences, business rules, and interaction history can provide personalized service (e.g., a customer-service agent recalling a customer's past issues, or a programming assistant matching a team's code style). Research: ASEM demonstrates reinforcement learning applied to memory management, offers new ideas for continual/lifelong learning, and its non-parametric utility update serves as a reference for lightweight designs.


Section 06

Limitations and Future Directions

Limitations: training the memory manager requires large amounts of data and compute, and the accuracy of utility estimation varies with task type. Future directions: integrating advanced RAG techniques, exploring memory compression and summarization, and extending to multi-modal memory (images, audio, etc.).


Section 07

Conclusion: ASEM's Significant Progress

ASEM lays the foundation for the self-evolution of LLM agents through structured memory organization, reinforcement learning management, and value-aware retrieval. As LLM applications deepen, such memory frameworks will become key components in building intelligent and adaptive AI systems.