The matsim_llm_plugins project adopts a modular architecture whose core goal is to establish a bidirectional interaction channel between MATSim agents and LLMs. The system is built around several key components:
ChatManager: The Dialogue Hub for Agents
Each MATSim agent is equipped with its own ChatManager instance, which provides the foundation for persistent memory and multi-turn reasoning. The ChatManager maintains the complete dialogue history, sends requests to the LLM, and drives multi-step tool-execution processes. This design gives each simulated agent an independent "thinking thread," so it can make coherent decisions informed by its past interactions.
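The per-agent dialogue state described above can be sketched as follows. This is a minimal illustration only: the names `ChatManager`, `Message`, and `callLlm` are assumptions for this sketch, not the project's actual API, and the LLM call is stubbed out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a per-agent dialogue hub; not the project's real class.
public class ChatManager {
    // One record per turn; role would be "user", "assistant", or "tool".
    record Message(String role, String content) {}

    private final String agentId;
    private final List<Message> history = new ArrayList<>();

    public ChatManager(String agentId) {
        this.agentId = agentId;
    }

    // Append the outgoing user turn, call the LLM (stubbed here),
    // and record the reply so the next turn sees the full history.
    public String send(String userContent) {
        history.add(new Message("user", userContent));
        String reply = callLlm(history);            // stub; a real LLM client would go here
        history.add(new Message("assistant", reply));
        return reply;
    }

    // Stand-in for the real LLM request; only shows that the full
    // accumulated context is passed along on every turn.
    private String callLlm(List<Message> context) {
        return "reply to turn " + context.size() + " for agent " + agentId;
    }

    public int historySize() { return history.size(); }
}
```

Because each agent owns its own instance, histories never mix across agents, which is what makes coherent multi-turn decisions possible.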
Tool Calling Framework: From Language to Action
The project implements a complete tool calling mechanism, supporting two types of tools:
- LLM Tools: Execution results are returned to the LLM for further reasoning
- Dummy Tools: Execution results are directly consumed by MATSim to trigger simulation state changes
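The distinction between the two tool types amounts to where the execution result is routed. A hedged sketch, in which the names `ToolDispatch`, `ToolKind`, and `simulationLog` are illustrative assumptions rather than the project's real API:

```java
import java.util.function.Function;

// Illustrative dispatcher: routes a tool's result either back to the LLM
// or into the simulation, depending on the tool's kind.
public class ToolDispatch {
    enum ToolKind { LLM, DUMMY }

    // A tool is a name, a kind, and a function from JSON args to a result string.
    record Tool(String name, ToolKind kind, Function<String, String> run) {}

    // Returns the string to feed back to the LLM, or null when the result
    // is consumed by the simulation instead (a "dummy" tool).
    static String execute(Tool tool, String args, StringBuilder simulationLog) {
        String result = tool.run().apply(args);
        if (tool.kind() == ToolKind.LLM) {
            return result;               // goes back into the chat for further reasoning
        }
        simulationLog.append(result);    // consumed by MATSim; nothing returned to the LLM
        return null;
    }
}
```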
Tool parameters are defined via Java DTOs, which are automatically converted into JSON Schema visible to the LLM, enabling type-safe parameter passing and validation. The system supports parallel tool calls: the LLM can request several tools in a single response, and the system executes them iteratively until every requested call has completed.
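The DTO-to-schema step can be illustrated with a minimal reflection pass. Everything here is an assumption for illustration: `RouteRequest` is a hypothetical parameter DTO, and the project's real converter will handle far more types and annotations than this two-type sketch.

```java
import java.lang.reflect.Field;
import java.util.StringJoiner;

// Minimal sketch: derive a JSON-Schema-like string from a parameter DTO
// via reflection, so the LLM can see typed tool parameters.
public class SchemaSketch {
    // Hypothetical parameter DTO for a routing tool (not from the project).
    static class RouteRequest {
        public String origin;
        public String destination;
        public int departureTimeSeconds;
    }

    static String toJsonSchema(Class<?> dto) {
        StringJoiner props = new StringJoiner(",");
        for (Field f : dto.getDeclaredFields()) {
            // Toy type mapping: int -> integer, everything else -> string.
            String type = f.getType() == int.class ? "integer" : "string";
            props.add("\"" + f.getName() + "\":{\"type\":\"" + type + "\"}");
        }
        return "{\"type\":\"object\",\"properties\":{" + props + "}}";
    }
}
```

Because the schema is derived from the DTO's declared fields and types, malformed or mistyped arguments from the LLM can be rejected before any tool runs.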
Retrieval-Augmented Generation (RAG): Dynamic Context Injection
To give LLMs access to real-time information from the simulation environment, the project integrates a RAG system backed by a vector database:
- Static Context: Road network data, pricing information, infrastructure layout
- Dynamic Context: Agent historical experience, runtime state, environmental changes
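The retrieval idea behind both context types can be shown with a toy in-memory store. The real system uses Qdrant and LangChain4j; this sketch only illustrates ranking context snippets by cosine similarity to a query embedding, with made-up two-dimensional vectors standing in for real embeddings.

```java
import java.util.Map;

// Toy retrieval sketch: nearest-neighbor lookup over hand-made embeddings.
// The project's real store is Qdrant accessed via LangChain4j.
public class RetrievalSketch {
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Return the stored snippet whose embedding is closest to the query.
    static String mostRelevant(Map<String, double[]> store, double[] query) {
        String best = null;
        double bestScore = -2;
        for (Map.Entry<String, double[]> e : store.entrySet()) {
            double s = cosine(query, e.getValue());
            if (s > bestScore) { bestScore = s; best = e.getKey(); }
        }
        return best;
    }

    // Demo with fake embeddings for a static and a dynamic context snippet.
    static String demo() {
        Map<String, double[]> store = Map.of(
                "road network layout", new double[]{1.0, 0.0},
                "current toll pricing", new double[]{0.0, 1.0});
        // A query "close to" the pricing snippet in embedding space.
        return mostRelevant(store, new double[]{0.1, 0.9});
    }
}
```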
Built on the Qdrant vector database and the LangChain4j framework, the system retrieves relevant context in milliseconds, supplying accurate background information for LLM decision-making.