ARMeta: A New Multi-Agent LLM-Based Metamorphic Testing Method for REST APIs

ARMeta leverages a large language model (LLM)-driven multi-agent workflow to automatically generate and execute metamorphic testing scenarios for REST APIs. By describing test relationships in the Given-When-Then format, it effectively addresses the test oracle problem in API testing.

Tags: Metamorphic testing, REST API, Multi-agent, Large language model, Software testing, Test oracle, OpenAPI, API testing
Published 2026-05-27 19:24 · Recent activity 2026-05-28 13:27 · Estimated read 9 min
Section 01

Introduction to ARMeta: A New Multi-Agent LLM-Based Metamorphic Testing Method for REST APIs

This article introduces ARMeta, a new method that uses an LLM-driven multi-agent workflow to automatically generate and execute metamorphic testing scenarios for REST APIs. By describing test relationships in the Given-When-Then format, the method addresses the test oracle problem in API testing.

Original paper information:

The following sections cover, in order: the challenges of REST API testing, ARMeta's method architecture, experimental results, technical highlights, application scenarios, limitations and future directions, and conclusions.

Section 02

Challenges in REST API Testing and Solutions via Metamorphic Testing

REST APIs sit at the core of modern software architectures, but testing them runs into the test oracle problem: for a complex API (e.g., an e-commerce order query interface), it is often impractical to determine in advance the correct output for every input.

Metamorphic testing sidesteps this problem by checking relationships between the outputs of related requests rather than the absolute correctness of any single output. For example:

  • After extending the time range of an order query, the number of returned results should not decrease;
  • Querying a non-existent user ID should return an empty list or error code;
  • A query over the union of two overlapping time ranges should include every result returned by each individual query.

Such relationships are called metamorphic relations. They check logical consistency across executions rather than specific output content.
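
The range-extension and union relations above can be sketched as plain checks. This is a minimal illustration, not ARMeta's generated code: `ORDERS`, `query_orders`, and `truncating_query` are hypothetical in-memory stand-ins for a real order-query endpoint.

```python
from datetime import date

# Hypothetical in-memory stand-in for an order-query endpoint.
ORDERS = [
    {"id": 1, "created": date(2024, 1, 5)},
    {"id": 2, "created": date(2024, 1, 20)},
    {"id": 3, "created": date(2024, 2, 10)},
]

def query_orders(start, end):
    """Return orders created within [start, end], inclusive."""
    return [o for o in ORDERS if start <= o["created"] <= end]

def truncating_query(start, end):
    """Deliberately faulty variant: drops rows when the range exceeds 40 days."""
    rows = query_orders(start, end)
    return rows[:1] if (end - start).days > 40 else rows

def holds_range_extension(query, start, end, wider_end):
    """Extending the end of the range must not reduce the result count."""
    return len(query(start, wider_end)) >= len(query(start, end))

def holds_union_inclusion(query, range_a, range_b):
    """A query spanning two overlapping ranges must contain each range's results."""
    def ids(rows):
        return {row["id"] for row in rows}
    combined = ids(query(min(range_a[0], range_b[0]), max(range_a[1], range_b[1])))
    return ids(query(*range_a)) <= combined and ids(query(*range_b)) <= combined

jan = (date(2024, 1, 1), date(2024, 1, 31))
print(holds_range_extension(query_orders, jan[0], jan[1], date(2024, 2, 28)))
print(holds_range_extension(truncating_query, jan[0], jan[1], date(2024, 2, 28)))
```

Neither check needs to know which orders are correct for a given range. The faulty variant is caught only because its two outputs contradict each other, which is the point of testing without an oracle.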

Section 03

System Architecture of ARMeta and Advantages of Multi-Agent Design

ARMeta's workflow consists of three phases:

  1. Test Scenario Identification: Analyze OpenAPI documents, perform parameter analysis, state recognition, and relation mining;
  2. Scenario Specification: Convert scenarios into the Given-When-Then format (e.g., Given user A has N orders in T1, When the time range is extended to T2, Then the number of returned orders ≥ N);
  3. Test Generation & Execution: Automatically convert to executable code, execute metamorphic transformations, and verify output relationships.
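
As a rough sketch of how a Given-When-Then scenario might map onto executable code, the example below models the three clauses as callables. The names `MetamorphicScenario`, `run_scenario`, and `fake_order_api` are assumptions made for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class MetamorphicScenario:
    name: str
    given: Callable[[], dict]          # builds the source request
    when: Callable[[dict], dict]       # derives the follow-up request from it
    then: Callable[[Any, Any], bool]   # relation between the two outputs

def run_scenario(scenario, call_api):
    """Send the source and follow-up requests, then evaluate the relation."""
    source_request = scenario.given()
    follow_up_request = scenario.when(source_request)
    source_output = call_api(source_request)
    follow_up_output = call_api(follow_up_request)
    return scenario.then(source_output, follow_up_output)

def fake_order_api(request):
    """Hypothetical endpoint: returns the order days inside the requested range."""
    order_days = [3, 12, 25, 40]
    return [d for d in order_days if request["from_day"] <= d <= request["to_day"]]

scenario = MetamorphicScenario(
    name="extending the time range never removes orders",
    given=lambda: {"user": "A", "from_day": 1, "to_day": 30},
    when=lambda req: {**req, "to_day": 60},
    then=lambda source, follow_up: len(follow_up) >= len(source),
)
result = run_scenario(scenario, fake_order_api)
print(scenario.name, "->", "pass" if result else "fail")
```

Keeping the three clauses separate means the same `then` relation can be reused with different `given` and `when` builders, for example across several list endpoints.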

Advantages of the multi-agent architecture:

  • Task specialization: Different agents are responsible for analysis, specification, code generation, etc.;
  • Error isolation: Errors in a single agent do not affect the overall workflow;
  • Scalability: Flexibly add new agents to handle specific APIs;
  • Quality improvement: Multi-round verification enhances test quality.
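
One way to picture task specialization and error isolation is a staged pipeline in which each endpoint is processed independently. The sketch below uses ordinary functions as stand-ins for LLM-backed agents. The agent names, the dictionary shapes, and `run_pipeline` are all hypothetical and say nothing about how ARMeta is actually implemented.

```python
def analysis_agent(endpoint):
    """Stand-in for an analysis agent: propose one candidate relation."""
    if "parameters" not in endpoint:
        raise ValueError("endpoint has no parameters to analyse")
    return {"endpoint": endpoint["path"],
            "relation": "widening a filter never removes results"}

def specification_agent(candidate):
    """Stand-in for a specification agent: phrase the relation as Given-When-Then."""
    return {
        "given": f"a baseline request to {candidate['endpoint']}",
        "when": "the filter range is widened",
        "then": candidate["relation"],
    }

def implementation_agent(spec):
    """Stand-in for an implementation agent: emit a test stub as text."""
    return f"# test: Given {spec['given']}, When {spec['when']}, Then {spec['then']}"

def run_pipeline(endpoints, stages):
    """Run every endpoint through all stages; a failure affects only that endpoint."""
    results, failures = [], []
    for endpoint in endpoints:
        artifact = endpoint
        try:
            for stage in stages:
                artifact = stage(artifact)
            results.append(artifact)
        except Exception as exc:
            failures.append((endpoint.get("path"), str(exc)))
    return results, failures

endpoints = [
    {"path": "/orders", "parameters": ["from", "to"]},
    {"path": "/health"},  # no parameters, so the analysis stage raises
]
tests, failures = run_pipeline(
    endpoints, [analysis_agent, specification_agent, implementation_agent]
)
print(len(tests), "test stub(s);", len(failures), "isolated failure(s)")
```

Because each stage only consumes the previous stage's output, adding a further stage such as a verification step amounts to appending one more function to the list.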

Section 04

Experimental Evaluation Results of ARMeta

The research team evaluated ARMeta on two public web applications, comparing it with traditional scenario testing baselines:

  • Test coverage: Explored behaviors that traditional methods struggle to cover, such as boundary conditions, state transitions, and exception paths;
  • Complementarity: Complements existing methods and can find defects missed by traditional approaches;
  • Practical effects: Identified multiple API consistency issues, generated high-quality test cases, and supported CI/CD integration.

Section 05

Technical Implementation Highlights of ARMeta

  1. OpenAPI Document Parsing: Supports standard OpenAPI documents, extracting endpoint paths, request parameters, response schemas, authentication requirements, and other information;
  2. Agent Collaboration: The analysis agent interprets API semantics, the specification agent converts scenarios into the Given-When-Then format, the implementation agent generates test code, and the verification agent checks correctness;
  3. High Automation: Users provide only the OpenAPI document, the target API's base URL, and optional authentication information; test generation and execution then run automatically.
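
To illustrate the parsing step, the sketch below pulls endpoint paths, parameters, response codes, and authentication requirements out of an OpenAPI document that has already been loaded into a Python dictionary. It is a simplified sketch under stated assumptions: `extract_operations` is a hypothetical helper, the inline `spec` is invented for the example, and `$ref` parameters are skipped rather than resolved.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}

def extract_operations(spec):
    """List each operation with its parameters, response codes, and auth flag."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        shared_params = item.get("parameters", [])  # path-level parameters
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip keys such as "parameters" or "summary"
            params = shared_params + op.get("parameters", [])
            # Operation-level security overrides the document-level default.
            security = op.get("security", spec.get("security", []))
            operations.append({
                "method": method.upper(),
                "path": path,
                "parameters": [(p["name"], p["in"]) for p in params if "name" in p],
                "responses": sorted(op.get("responses", {})),
                "secured": bool(security),
            })
    return operations

# Invented example document, not taken from the paper's evaluation subjects.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "Orders", "version": "1.0"},
    "security": [{"bearerAuth": []}],
    "paths": {
        "/orders": {
            "get": {
                "parameters": [
                    {"name": "from", "in": "query", "schema": {"type": "string", "format": "date"}},
                    {"name": "to", "in": "query", "schema": {"type": "string", "format": "date"}},
                ],
                "responses": {"200": {"description": "OK"}, "400": {"description": "Bad range"}},
            }
        }
    },
}
operations = extract_operations(spec)
for op in operations:
    print(op["method"], op["path"], op["parameters"], op["responses"], op["secured"])
```

A pair of date-typed query parameters like `from` and `to` is the kind of signal from which a range-extension relation could be proposed for this endpoint.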

Section 06

Application Scenarios and Value of ARMeta

  • API Development Phase: Quickly verify design rationality, find boundary condition handling issues, and ensure behavioral consistency;
  • Regression Testing: Integrate into CI/CD workflows to automatically detect regression defects introduced by changes and verify version consistency;
  • Third-Party API Integration: Verify whether third-party APIs conform to document descriptions, identify implicit constraints, and establish health monitoring mechanisms.

Section 07

Limitations of ARMeta and Future Research Directions

Current Limitations:

  1. Limited coverage of metamorphic relations; complex relation patterns need further exploration;
  2. Test generation for APIs with complex state management remains challenging;
  3. Multi-agent LLM calls incur high computational costs.

Future Directions:

  • Smarter metamorphic relation discovery;
  • Incremental testing to support API version changes;
  • Optimize agent calling strategies to reduce costs;
  • Extend to other API protocols such as GraphQL.

Section 08

Innovative Value and Outlook of ARMeta

ARMeta is an innovative attempt to apply LLMs to software testing. By combining a multi-agent workflow with metamorphic testing, it addresses the test oracle problem in REST API testing and generates tests automatically.

This research demonstrates the application potential of LLMs in software engineering and provides a new path for API test automation. As API-driven architectures continue to develop, such intelligent testing tools will play an important role in ensuring software quality.