# su-memory SDK: Building a Local-First AI Memory System with Causal Reasoning Capabilities

> su-memory SDK is a local-first AI memory framework. Using VectorGraphRAG, spacetime indexing, and causal graph technologies, it achieves an 87.8% multi-hop reasoning recall rate and a 96% latency reduction, providing LLM applications with true multi-hop causal reasoning capabilities.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-25T13:09:04.000Z
- Last activity: 2026-04-25T13:18:00.755Z
- Popularity: 154.8
- Keywords: AI memory system, VectorGraphRAG, causal reasoning, local-first, multi-hop reasoning, RAG, vector database, temporal indexing, privacy protection, LangChain
- Page URL: https://www.zingnex.cn/en/forum/thread/su-memory-sdk-ai
- Canonical: https://www.zingnex.cn/forum/thread/su-memory-sdk-ai
- Markdown source: floors_fallback

---

## su-memory SDK Guide: Local-First AI Memory System with Causal Reasoning

su-memory SDK is a local-first AI memory framework that addresses the gaps traditional vector databases leave in causal reasoning, temporal awareness, and multi-hop association. Built on VectorGraphRAG, spacetime indexing, and causal-graph technologies, it reports an 87.8% multi-hop reasoning recall rate and a 96% latency reduction, giving LLM applications genuine multi-hop causal reasoning capabilities.

## Current Status and Challenges of AI Memory Systems

Most current AI applications build memory on vector-similarity nearest-neighbor search, which can only handle 'find similar' tasks and fails at reasoning questions such as 'why did this happen' or 'what will happen next'. Traditional systems lack the core capabilities of human memory: causal reasoning, temporal awareness, and multi-hop association. This is the core problem su-memory SDK aims to solve.
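To make the 'find similar' ceiling concrete, here is a toy nearest-neighbor lookup using cosine similarity; the memories, labels, and embedding vectors are all invented for illustration. The search can rank memories by closeness to a query, but nothing in the index records which memory caused which:

```python
import math

# Toy nearest-neighbor search: ranks stored vectors by cosine similarity
# and nothing more -- no causal, temporal, or multi-hop information.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Made-up memories with made-up 3-d embeddings.
memories = {
    "the server crashed last night": [0.9, 0.1, 0.0],
    "the deploy script failed":      [0.7, 0.3, 0.0],
    "lunch menu for Tuesday":        [0.0, 0.1, 0.9],
}

query = [0.8, 0.2, 0.0]  # pretend embedding of "server crash"
best = max(memories, key=lambda m: cosine(query, memories[m]))
print(best)  # finds the nearest memory, but cannot answer *why* it crashed
```

Answering the 'why' question requires explicit relations between memories, which is what the causal-graph components described below are for.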

## Analysis of Core Technical Architecture

su-memory SDK adopts a 'four-in-one' architecture:
1. VectorGraphRAG: Integrates vector retrieval and graph traversal, enabling efficient multi-hop reasoning via an HNSW index (m=32, efConstruction=64, efSearch=64) and vector quantization (FP32/FP16/INT8/Binary);
2. SpacetimeIndex: Combines spatial location and temporal encoding, supporting spacetime multi-hop queries;
3. MemoryGraph: Explicitly defines four causal relationships (cause/condition/result/sequence) to enhance interpretability;
4. TemporalSystem: Implements temporal awareness, simulating the time decay characteristic of human memory.
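The MemoryGraph idea above can be sketched roughly as follows. The class name, method names, and example memories are hypothetical (the SDK's real API is not shown in this post); the sketch only illustrates storing the four relation types and walking them for multi-hop chains:

```python
from collections import defaultdict

# The four causal relation types named in the architecture description.
RELATIONS = {"cause", "condition", "result", "sequence"}

class MemoryGraph:
    """Hypothetical sketch: memories are nodes, edges carry a relation type."""

    def __init__(self):
        # node -> list of (relation, destination node)
        self.edges = defaultdict(list)

    def link(self, src, relation, dst):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.edges[src].append((relation, dst))

    def multi_hop(self, start, max_hops=3):
        """Breadth-first walk up to max_hops, returning causal chains."""
        chains = []
        frontier = [(start, [])]
        for _ in range(max_hops):
            next_frontier = []
            for node, path in frontier:
                for relation, dst in self.edges[node]:
                    chain = path + [(node, relation, dst)]
                    chains.append(chain)
                    next_frontier.append((dst, chain))
            frontier = next_frontier
        return chains

g = MemoryGraph()
g.link("server crashed", "cause", "deploy failed")
g.link("deploy failed", "result", "rollback triggered")
print(g.multi_hop("server crashed", max_hops=2))
```

Because every edge carries an explicit relation type, each returned chain is a human-readable explanation, which is where the interpretability claim comes from.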

## Performance Data and Engineering Practice

The project's released performance benchmark data:
- Query latency: P50=19ms (96% reduction compared to pre-optimization), P95=76ms;
- Throughput: 94 inserts per second, with ~10.66ms processing time per item;
- Memory usage: 1.53MB for 1000 memories;
- Multi-hop recall rate: 87.8% (46% improvement over baseline).
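Percentile figures like P50/P95 above can be reproduced independently with a small timing harness. This is a generic sketch, not the project's benchmark code: `query_fn` stands in for whatever query call the SDK actually exposes, and the dummy workload below is arbitrary:

```python
import time

def latency_percentiles(query_fn, queries, percentiles=(50, 95)):
    """Time each query in milliseconds and report the requested percentiles."""
    samples_ms = []
    for q in queries:
        t0 = time.perf_counter()
        query_fn(q)
        samples_ms.append((time.perf_counter() - t0) * 1000.0)
    samples_ms.sort()
    out = {}
    for p in percentiles:
        # Nearest-rank percentile on the sorted samples.
        idx = min(len(samples_ms) - 1, int(len(samples_ms) * p / 100))
        out[f"P{p}"] = samples_ms[idx]
    return out

# Usage with a dummy query function standing in for the real SDK call:
stats = latency_percentiles(lambda q: sum(range(1000)), range(200))
print(stats)
```

When evaluating claims like these, measuring on your own corpus and hardware matters more than the vendor's numbers, since latency depends heavily on index parameters and memory count.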

## Version Strategy and Application Scenarios

su-memory ships in two tiers, Lite and LitePro:
- Lite: TF-IDF/N-gram retrieval, memory <5MB, suitable for prototype validation;
- LitePro: Integrates the bge-m3 embedding model via Ollama, supports the full VectorGraphRAG and spacetime indexing stack, memory <50MB, suitable for production environments.
Application scenarios include long-term dialogue systems, knowledge management tools, predictive applications, and multimodal AI, and it is compatible with LangChain and VMC architectures.
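The TF-IDF retrieval attributed to the Lite tier can be sketched minimally as follows. The class and its tokenization are assumptions for illustration (the SDK's actual scoring and tokenizer are not documented in this post), but they show why this tier stays tiny: it needs only term counts, no embedding model:

```python
import math
from collections import Counter

class TfIdfMemory:
    """Minimal TF-IDF memory store: whitespace tokens, log-IDF weighting."""

    def __init__(self):
        self.docs = []              # raw memory texts
        self.term_docs = Counter()  # term -> number of docs containing it

    def add(self, text):
        self.docs.append(text)
        for term in set(text.lower().split()):
            self.term_docs[term] += 1

    def search(self, query, k=3):
        n = len(self.docs)
        scored = []
        for doc in self.docs:
            tf = Counter(doc.lower().split())
            # Smoothed IDF so unseen terms do not divide by zero.
            score = sum(
                tf[t] * math.log((n + 1) / (self.term_docs[t] + 1))
                for t in query.lower().split()
            )
            scored.append((score, doc))
        scored.sort(key=lambda s: -s[0])
        return [doc for _, doc in scored[:k]]

store = TfIdfMemory()
store.add("user prefers dark mode")
store.add("meeting at noon")
print(store.search("dark mode", k=1))
```

This keyword-level matching is what makes Lite cheap enough for prototype validation, and also why LitePro swaps it for dense embeddings when semantic recall matters.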

## Business Model and Limitations

Licensing model: free for individuals (limited to 1,000 items); commercial use is paid, from 99 yuan/month up to 9,999 yuan for private deployment.
Limitations:
- Scale limit: Enterprise version has an upper limit of 100,000 items, not suitable for large-scale document retrieval;
- Ecosystem maturity: Community and toolchain are still under construction;
- Dependency: LitePro requires Ollama to run local models, increasing deployment complexity.

## Summary and Selection Recommendations

su-memory represents the evolution of AI memory systems from storage retrieval to cognitive architecture. Its local-first approach, interpretability, and multimodal capabilities give it an advantage in privacy-sensitive and deep reasoning scenarios. It is recommended for AI application developers who need 'understanding/reasoning' rather than just 'matching/retrieval' to evaluate it. In the future, such local memory systems may become a standard component of next-generation AI applications.
