# LLM-based Multi-Agent Metamorphic Testing for FMU Simulation Models: A New Automated Verification Solution

> An automated testing framework using large language models (LLMs) and multi-agent collaboration that automatically extracts metamorphic relations from specifications and generates test cases, solving the testing challenge of FMU simulation models lacking explicit expected outputs.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-24T14:30:56.000Z
- Last activity: 2026-05-26T02:54:19.677Z
- Popularity: 112.6
- Keywords: metamorphic testing, FMU simulation, multi-agent, LLM, automated testing, FMI, industrial simulation
- Page link: https://www.zingnex.cn/en/forum/thread/llmfmu
- Canonical: https://www.zingnex.cn/forum/thread/llmfmu
- Markdown source: floors_fallback

---

## LLM-based Multi-Agent Metamorphic Testing for FMU Simulation Models: Core Introduction

This article introduces a solution that uses an LLM-driven multi-agent workflow to generate metamorphic relations automatically from specifications. It addresses a core difficulty in testing FMU (Functional Mock-up Unit) simulation models: they usually have no explicit expected outputs to compare against. The solution comes from the paper "Multi-Agent Specification-based Metamorphic Testing of FMU-Based Simulations", published on arXiv on May 24, 2026 (link: http://arxiv.org/abs/2605.25101v1).

## Testing Dilemmas and Background of FMU Simulation Models

### What are FMI and FMU?
FMI (Functional Mock-up Interface) is a widely adopted industry standard for exchanging simulation models. Models developed in different tools can be packaged as FMUs and exchanged, which makes cross-organizational collaboration easier.
### Testing Challenges
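1. **Black-box nature**: An FMU ships as a packaged archive, usually containing compiled binaries, so its internals are not readily inspectable and white-box testing is impractical;
2. **Lack of expected outputs**: Complex dynamic systems have no known "correct outputs" to serve as a benchmark (the test oracle problem);
3. **State space explosion**: The input space is effectively infinite, making exhaustive testing impossible.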

## Solution: Metamorphic Testing and LLM Multi-Agent Framework

### Core Idea of Metamorphic Testing
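Instead of judging the correctness of a single output directly, metamorphic testing checks whether the outputs of related executions satisfy expected relationships (metamorphic relations, MRs). For example, sin(-x) = -sin(x): a sine implementation can be checked by comparing its outputs at x and -x without knowing the true value at either point. MRs for industrial models include scaling, monotonicity, invariance, and conservation laws.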
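
The sine example can be sketched as a small test harness. This is a minimal illustration only; the function name `check_odd_symmetry`, the sampling range, and the tolerance are assumptions made for this sketch and do not come from the paper.

```python
import math
import random


def check_odd_symmetry(f, samples=1000, tol=1e-12, seed=0):
    """Return the sampled inputs x for which f(-x) = -f(x) is violated.

    No expected value of f(x) is needed: only the relation between the
    source execution f(x) and the follow-up execution f(-x) is checked.
    """
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    violations = []
    for _ in range(samples):
        x = rng.uniform(-10.0, 10.0)
        if not math.isclose(f(-x), -f(x), rel_tol=0.0, abs_tol=tol):
            violations.append(x)
    return violations


# sin is odd, so the relation should hold for every sample
sin_violations = check_odd_symmetry(math.sin)
# cos is even, so the same relation should be violated
cos_violations = check_odd_symmetry(math.cos)
print(len(sin_violations), len(cos_violations) > 0)
```

The same shape carries over to simulation models: run a source input, run a transformed follow-up input, and compare the two outputs against the relation.
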
### LLM Multi-Agent Framework
The framework consists of five types of collaborating agents:
1. **Specification Parsing Agent**: Reads specification documents and identifies variables and requirements;
2. **Requirement Extraction Agent**: Identifies MR sources (symmetry, conservation laws, etc.) from specifications;
3. **MR Generation Agent**: Generates formal MRs using the Given-When-Then pattern;
4. **Test Generation Agent**: Converts MRs into executable test cases;
5. **Execution and Verification Agent**: Coordinates FMU simulation, verifies MRs, and generates reports.
### Advantages of the Given-When-Then Pattern
MRs written in this pattern are readable, clearly structured, easy to automate, and traceable.
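
One way to picture a Given-When-Then MR in code is shown below. This is a hedged sketch, not the paper's data model: the class `MetamorphicRelation`, its field names, and the `toy_model` stand-in are all hypothetical and exist only to show how the three parts map onto a precondition, an input transformation, and an output relation.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict

Inputs = Dict[str, float]
Outputs = Dict[str, float]


@dataclass(frozen=True)
class MetamorphicRelation:
    """Hypothetical Given-When-Then container for one MR."""
    name: str
    given: Callable[[Inputs], bool]           # precondition on the source inputs
    when: Callable[[Inputs], Inputs]          # builds the follow-up inputs
    then: Callable[[Outputs, Outputs], bool]  # relation between source and follow-up outputs

    def check(self, simulate: Callable[[Inputs], Outputs], source: Inputs) -> bool:
        if not self.given(source):
            raise ValueError(f"{self.name}: precondition not met for {source}")
        follow_up = self.when(source)
        return self.then(simulate(source), simulate(follow_up))


def toy_model(inputs: Inputs) -> Outputs:
    """Assumed linear stand-in for a simulation run (not an FMU)."""
    return {"y": 3.0 * inputs["u"]}


# Scaling MR: given u > 0, when u is doubled, then y doubles as well.
scaling = MetamorphicRelation(
    name="scaling",
    given=lambda i: i["u"] > 0,
    when=lambda i: {**i, "u": 2.0 * i["u"]},
    then=lambda src, fol: math.isclose(fol["y"], 2.0 * src["y"]),
)

print(scaling.check(toy_model, {"u": 1.5}))
```

Keeping the three parts as separate fields means each MR can be reported with its precondition and transformation alongside the verdict, which is one plausible reading of the traceability benefit.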

## Case Study: Verification of a Lubricating Oil Cooling System Simulation Model

### Examples of Automatically Generated MRs
- **MR-1**: Load-temperature monotonicity (under steady state, increasing the heat load leads to a monotonic rise in oil temperature);
- **MR-2**: Flow conservation (in a closed loop, the inflow and outflow of the radiator are equal);
- **MR-3**: Cooling efficiency boundary (with ambient temperature fixed, raising the fan speed to 100% results in a reasonable drop in oil temperature).
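
The three MRs above can be sketched as executable checks. Everything in this block is an assumption made for illustration: `toy_cooler` is a made-up steady-state formula standing in for the FMU, the variable names and units are invented, and "reasonable drop" in MR-3 is interpreted here as "not above the source temperature and not below ambient". In practice the `simulate` callable would wrap an FMU runner (for example, a library such as FMPy) instead of the toy function.

```python
from typing import Callable, Dict, Iterable

Inputs = Dict[str, float]
Outputs = Dict[str, float]
Simulator = Callable[[Inputs], Outputs]


def toy_cooler(inputs: Inputs) -> Outputs:
    """Hypothetical steady-state stand-in for the FMU, not the paper's model."""
    # Assumed heat-transfer capacity in kW/K, growing with fan speed (0..1).
    ua = 0.5 + 1.5 * inputs["fan_speed"]
    flow = inputs["oil_flow_kg_s"]
    return {
        "oil_temp_c": inputs["ambient_c"] + inputs["heat_load_kw"] / ua,
        "radiator_inflow_kg_s": flow,
        "radiator_outflow_kg_s": flow,
    }


def check_mr1(simulate: Simulator, base: Inputs, loads: Iterable[float]) -> bool:
    """MR-1: oil temperature must not decrease as heat load increases."""
    temps = [simulate({**base, "heat_load_kw": q})["oil_temp_c"] for q in sorted(loads)]
    return all(a <= b for a, b in zip(temps, temps[1:]))


def check_mr2(simulate: Simulator, base: Inputs, tol: float = 1e-6) -> bool:
    """MR-2: radiator inflow and outflow must match within a tolerance."""
    out = simulate(base)
    return abs(out["radiator_inflow_kg_s"] - out["radiator_outflow_kg_s"]) <= tol


def check_mr3(simulate: Simulator, base: Inputs) -> bool:
    """MR-3 (assumed reading): full fan speed cools the oil, but not below ambient."""
    source = simulate(base)["oil_temp_c"]
    follow_up = simulate({**base, "fan_speed": 1.0})["oil_temp_c"]
    return base["ambient_c"] <= follow_up <= source


base_case = {"heat_load_kw": 20.0, "ambient_c": 30.0, "fan_speed": 0.3, "oil_flow_kg_s": 1.2}
results = {
    "MR-1": check_mr1(toy_cooler, base_case, [5.0, 10.0, 20.0, 40.0]),
    "MR-2": check_mr2(toy_cooler, base_case),
    "MR-3": check_mr3(toy_cooler, base_case),
}
print(results)
```

Passing the simulator in as a parameter keeps the checks independent of how the model is executed, so the same relations could be reused across FMU versions.
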
### Experimental Results
- Successfully generated physically reasonable MRs;
- Significantly reduced manual workload;
- Discovered anomalies in model boundary conditions;
- Improved test coverage.

## Technical Advantages and Current Limitations

### Advantages
- **Multi-agent collaboration**: Each agent is specialized, which makes the workflow interpretable, scalable, and robust;
- **LLM role**: The LLM contributes natural language understanding, domain knowledge reasoning, creative generation, and formal transformation.
### Limitations
- Dependent on the quality of specification documents;
- LLMs may generate hallucinated MRs;
- High computational cost;
- Adaptability to other physical domains needs verification.

## Practical Insights and Recommendations

### For Simulation Model Developers
1. Emphasize the quality of specification documents as the foundation for automated testing;
2. Adopt metamorphic thinking for test design;
3. Human-machine collaboration: Use LLMs to generate initial MR drafts, then refine them with expert review.
### For Test Engineers
1. When "correct outputs" cannot be defined, use metamorphic relations to verify relationships between outputs;
2. Specification-driven testing: Extract test basis from the requirement phase;
3. AI as an assistant rather than a replacement; final decisions depend on human judgment.

## Research Summary and Future Directions

### Summary
This solution combines an LLM-driven multi-agent workflow with metamorphic testing, offering a new approach to verifying industrial simulation models. In the case study, it reduced manual workload and improved test coverage.
### Future Directions
1. Develop automatic MR quality assessment methods;
2. Introduce active learning to optimize MR generation;
3. Support multi-modal specification documents;
4. Explore real-time test generation in CI/CD pipelines.
