# Thinking Agents: A Goal-Oriented Multi-Agent System Based on Graph Networks and Active Inference

> An agent platform integrating RAG, graph neural networks, and active inference, enabling goal-oriented autonomous planning and experience reuse through decision graphs

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-03-30T05:46:47.000Z
- Last activity: 2026-03-30T05:56:20.744Z
- Popularity: 163.8
- Keywords: multi-agent systems, RAG, graph neural networks, active inference, goal orientation, LLM judging, decision graph, experience reuse, React, Flask
- Page link: https://www.zingnex.cn/en/forum/thread/thinking-agents
- Canonical: https://www.zingnex.cn/forum/thread/thinking-agents
- Markdown source: floors_fallback

---

## Thinking Agents Project Overview: Core Innovations of a Goal-Oriented Multi-Agent System

Developed by Marcus Anderson, this project integrates Retrieval-Augmented Generation (RAG), graph neural networks, and active inference theory to build a goal-oriented multi-agent platform. The core innovation is the **decision graph** mechanism, which records agents' decision paths as reusable knowledge assets. It targets three limitations of traditional LLM agents (weak goal orientation, weak planning ability, and no accumulation of experience) and enables autonomous planning and cross-task experience reuse.

Project link: https://github.com/maracman/thinking-agents

## Dilemmas of LLM Agents: Challenges from Dialogue to Action

Traditional LLM-driven agents excel at dialogue but have fundamental limitations:
- Lack of true goal orientation, making it hard to complete specific tasks;
- Lack of systematic planning ability, easily getting lost in complex problems;
- Unable to accumulate experience from failures, repeating mistakes.

The core issue is the absence of effective mechanisms to manage goals, evaluate progress, and adjust strategies. Humans, by contrast, naturally decompose tasks, set subgoals, and backtrack when a path fails; reproducing that behavior is a key challenge for autonomous AI.

## Innovative Architecture: Hierarchical Intelligence and Decision Loop

The system adopts a layered intelligence design with core mechanisms including:
1. **Goal-oriented agent loop**: The cognitive core. Each cycle checks the decision graph, generates subgoals, executes actions, and has an LLM evaluate progress;
2. **LLM evaluation mechanism**: A meta-cognitive component in which a lightweight LLM judge scores each action on a 1-7 scale, triggering a Go/NoGo decision and avoiding the bias of agent self-assessment;
3. **Graph intelligence engine**: The foundation for memory and learning. It encodes Go/NoGo decisions as graph edges whose weights reflect attempt counts, and enables experience reuse through semantic embeddings and shortest-path search.
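
The loop and the Go/NoGo gate described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `run_loop`, `go_nogo`, the callback signatures, and the cut-off `GO_THRESHOLD = 4` are all assumptions (the source only states that scores range from 1 to 7), and the `propose` callback stands in for "check the decision graph and generate a subgoal".

```python
from typing import Callable, List, Optional, Tuple

GO_THRESHOLD = 4  # assumed cut-off on the 1-7 scale; the source does not state one

Step = Tuple[str, int, str]  # (subgoal, judge score, "Go" or "NoGo")


def go_nogo(score: int, threshold: int = GO_THRESHOLD) -> str:
    """Map a 1-7 judge score to a Go/NoGo decision."""
    if not 1 <= score <= 7:
        raise ValueError(f"score must be in 1..7, got {score}")
    return "Go" if score >= threshold else "NoGo"


def run_loop(
    goal: str,
    propose: Callable[[str, List[Step]], Optional[str]],
    act: Callable[[str], str],
    judge: Callable[[str, str, str], int],
    max_steps: int = 5,
) -> List[Step]:
    """Propose a subgoal, act on it, have a judge score it, record the decision."""
    history: List[Step] = []
    for _ in range(max_steps):
        subgoal = propose(goal, history)  # hypothetical: would consult the decision graph
        if subgoal is None:               # nothing left to try
            break
        result = act(subgoal)
        score = judge(goal, subgoal, result)  # hypothetical: a lightweight LLM call
        history.append((subgoal, score, go_nogo(score)))
    return history


# Stub callbacks so the sketch runs without any LLM.
plan = iter(["search docs", "draft answer"])
stub_scores = {"search docs": 6, "draft answer": 2}
history = run_loop(
    goal="answer the question",
    propose=lambda goal, hist: next(plan, None),
    act=lambda subgoal: f"did: {subgoal}",
    judge=lambda goal, subgoal, result: stub_scores[subgoal],
)
print(history)
```

In a real system the recorded `Go`/`NoGo` entries would be written back to the decision graph as edges, which is what lets later runs avoid paths that were already rejected.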

## Decision Graph: Experience Reuse and Optimal Path Search

The decision graph is the system's most distinctive design. Unlike traditional RAG, which retrieves document fragments, it retrieves **successful decision paths**:
- **Semantic node embedding**: Uses the `all-MiniLM-L6-v2` model to embed node labels into a semantic space, searching for relevant historical nodes via cosine similarity;
- **Weighted path search**: Uses NetworkX's shortest-path algorithm, combining historical attempt counts (`persistence_count`) with failure penalties (NoGo edge weights are multiplied by 10) to find the lowest-cost path;
- **Graph fusion and transfer**: Supports cross-agent graph import and merging to form a shared knowledge base;
- **Similarity links**: Automatically detects semantically similar nodes (cosine similarity >0.8) after merging and establishes low-cost connections.
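
To make the weighted search and the similarity links concrete, here is a small dependency-free sketch. Several things in it are assumptions rather than the project's implementation: the source uses NetworkX (where `networkx.shortest_path` with a `weight` argument plays this role), whereas this version swaps in a hand-written Dijkstra so it runs on the standard library alone; the edge cost formula (`persistence_count`, multiplied by 10 for NoGo), the link cost of `0.1`, and the helper names are hypothetical; and the toy 2-D vectors stand in for `all-MiniLM-L6-v2` embeddings.

```python
import heapq
import math
from typing import Dict, List, Optional, Sequence, Tuple

NOGO_PENALTY = 10  # from the text: NoGo edge weight x10
Graph = Dict[str, Dict[str, float]]  # src -> dst -> edge cost


def add_decision(graph: Graph, src: str, dst: str,
                 persistence_count: int, decision: str) -> None:
    """Record a decision edge. Assumed cost: attempt count, x10 if NoGo."""
    weight = float(persistence_count)
    if decision == "NoGo":
        weight *= NOGO_PENALTY
    graph.setdefault(src, {})[dst] = weight
    graph.setdefault(dst, {})


def cheapest_path(graph: Graph, start: str,
                  goal: str) -> Optional[Tuple[float, List[str]]]:
    """Dijkstra's algorithm; returns (total cost, path) or None if unreachable."""
    queue: List[Tuple[float, str, List[str]]] = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in visited:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return None


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_similar(graph: Graph, embeddings: Dict[str, Sequence[float]],
                 threshold: float = 0.8, cost: float = 0.1) -> List[Tuple[str, str]]:
    """Connect node pairs whose similarity exceeds the threshold with cheap edges."""
    names = sorted(embeddings)
    added = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) > threshold:
                graph.setdefault(a, {}).setdefault(b, cost)
                graph.setdefault(b, {}).setdefault(a, cost)
                added.append((a, b))
    return added


graph: Graph = {}
add_decision(graph, "start", "read logs", 1, "Go")
add_decision(graph, "read logs", "resolved", 2, "Go")
add_decision(graph, "start", "guess fix", 1, "NoGo")
add_decision(graph, "guess fix", "resolved", 1, "Go")
best = cheapest_path(graph, "start", "resolved")

toy_embeddings = {"read logs": [1.0, 0.0], "inspect logs": [0.9, 0.1], "guess fix": [0.0, 1.0]}
links = link_similar(graph, toy_embeddings)
print(best, links)
```

Because the rejected `guess fix` branch costs 10 while the two `Go` edges cost 3 in total, the search prefers the previously successful route; the similarity step then ties the newly imported `inspect logs` node to `read logs` so that experience attached to one is reachable from the other.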

## Cognitive Science Foundation: Application of Active Inference Theory

The system's design is rooted in Karl Friston's **active inference** theory, under which agents act to minimize prediction error:
- **Predictive goal setting**: Proactively predicts paths to achieve goals, driving subgoal generation and path selection;
- **Error-driven adaptation**: An LLM score below expectations is treated as a prediction error and triggers a strategy adjustment (a NoGo decision);
- **Free energy minimization**: Prioritizes paths with high historical success rates (low-weight edges) to reduce cognitive effort.
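
For reference, the quantity that active inference minimizes is usually written as the variational free energy below, where $o$ denotes observations, $s$ hidden states, $p$ the agent's generative model, and $q$ its approximate posterior. The source does not say that the project computes this quantity; as described above, the system appears to use low edge weights as a heuristic stand-in for it.

```latex
F[q] = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
     = D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] - \ln p(o)
```

Since the KL term is non-negative, $F$ is an upper bound on the surprise $-\ln p(o)$, which is why lowering it corresponds to reducing prediction error.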

## Technical Implementation Highlights: Full-Stack AI Application Architecture

The tech stack is that of a typical full-stack AI application:
- **Multi-provider LLM support**: Abstracts interfaces for OpenAI/Anthropic/Cohere/HuggingFace/local GGUF models, with automatic fallback and retry;
- **Front-end and back-end separation**: The front end uses React 17 with Webpack 5; the back end runs Flask behind the Waitress WSGI server;
- **Interactive visualization**: Generates HTML graphs via PyVis and embeds them in the front end through an iframe to display decision processes;
- **Persistence management**: Sessions can be saved, loaded, copied, and deleted; decision graphs are stored as JSON so that knowledge persists across sessions.
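
The fallback-and-retry behavior mentioned in the list above might look something like the following. This is a sketch under stated assumptions, not the project's abstraction layer: `complete_with_fallback`, the `(name, callable)` provider shape, and the retry and backoff parameters are hypothetical, and the two stub providers replace real API clients.

```python
import time
from typing import Callable, List, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]  # (name, function mapping prompt -> reply)


def complete_with_fallback(prompt: str, providers: Sequence[Provider],
                           retries: int = 2, backoff: float = 0.0) -> Tuple[str, str]:
    """Try each provider in order, retrying it before falling back to the next."""
    errors: List[str] = []
    for name, call in providers:
        for attempt in range(1, retries + 1):
            try:
                return name, call(prompt)
            except Exception as exc:  # a real client would catch narrower errors
                errors.append(f"{name} attempt {attempt}: {exc}")
                time.sleep(backoff * attempt)  # linear backoff; 0.0 disables waiting
    raise RuntimeError("all providers failed: " + "; ".join(errors))


calls: List[str] = []


def flaky(prompt: str) -> str:
    """Stub provider that always times out."""
    calls.append("primary")
    raise TimeoutError("simulated timeout")


def backup(prompt: str) -> str:
    """Stub provider that always succeeds."""
    calls.append("backup")
    return f"echo: {prompt}"


used, reply = complete_with_fallback("hi", [("primary", flaky), ("backup", backup)])
print(used, reply, calls)
```

Keeping providers as plain callables behind one function is one way to let the agent loop stay unaware of which vendor actually answered.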

## Application Scenarios and Value: From Complex Problem-Solving to Knowledge Inheritance

The system applies to various complex scenarios:
- **Complex problem solving**: Avoids repeated mistakes via decision graphs and accumulates experience in solving specific problems;
- **Multi-agent collaboration**: Multiple agents explore different paths, merging graphs to form a comprehensive knowledge base;
- **Knowledge inheritance**: New agents import experience graphs to quickly gain domain capabilities;
- **Dialogue games and narrative AI**: Encodes plot paths to enhance NPC behavior coherence.
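
As an illustration of the collaboration and inheritance scenarios, the sketch below merges two agents' graphs and round-trips the result through JSON. The layout (`src -> dst -> persistence_count`) and the merge rule (summing counts on shared edges) are assumptions made for the example; the source states only that graphs are stored as JSON and can be imported and merged.

```python
import json
from typing import Dict

EdgeMap = Dict[str, Dict[str, int]]  # src -> dst -> persistence_count (assumed layout)


def merge_graphs(a: EdgeMap, b: EdgeMap) -> EdgeMap:
    """Union of two edge maps; counts on shared edges are summed (assumed rule)."""
    merged: EdgeMap = {src: dict(dsts) for src, dsts in a.items()}
    for src, dsts in b.items():
        target = merged.setdefault(src, {})
        for dst, count in dsts.items():
            target[dst] = target.get(dst, 0) + count
    return merged


agent_a: EdgeMap = {"start": {"read logs": 2}}
agent_b: EdgeMap = {"start": {"read logs": 1, "ask user": 1}}

shared = merge_graphs(agent_a, agent_b)
restored = json.loads(json.dumps(shared))  # what a new agent would import
print(restored)
```

The inputs are left unmodified, so each agent keeps its own graph while the merged copy serves as the shared knowledge base.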

## Limitations and Future Improvement Directions

The current system has the following limitations and improvement directions:
1. **Subjective evaluation mechanism**: LLM-based scoring may be inconsistent; objective metrics or multi-judge consensus could make it more reliable;
2. **Insufficient graph semantic expression**: Sentence embeddings cannot capture complex contexts; more advanced GNN architectures could be adopted;
3. **Slow learning speed**: Learning depends on the number of real interactions; simulated environments could enable faster self-play learning.