Zing Forum

TRACER: Automatically Explore and Test Conversational AI Systems Using Large Language Models

TRACER is an automated tool based on large language models that can intelligently explore the functional boundaries of chatbots, generate user personas, and create complete test suites.

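Tags: Conversational AI, Chatbot Testing, Large Language Models, Automated Testing, LangGraph, Functional Exploration, User Persona Generation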
Published 2026-05-22 16:42 | Recent activity 2026-05-22 16:50 | Estimated read 5 min

Section 01

TRACER: A Guide to the LLM-Based Automated Testing Tool for Conversational AI

TRACER is an open-source Python tool built on large language models and the LangGraph architecture. It automatically explores the functional boundaries of conversational AI systems, generates user personas, and creates complete test suites. Traditional manual testing is time-consuming, labor-intensive, and incomplete in coverage; TRACER addresses these pain points by "testing AI with AI".

Section 02

Background and Challenges of Conversational AI Testing

As conversational AI spreads across industries, ensuring that a chatbot responds reliably and accurately to user requests has become a key challenge for developers. Traditional testing relies on hand-written test cases, which is inefficient and rarely covers every interaction scenario. TRACER (Task Recognition and Chatbot ExploreR) was created to address this, using the language understanding of large language models to automate exploration and testing.

Section 03

Core Technical Principles and Implementation Details of TRACER

Multi-stage Exploration Process

  1. Session Preparation: send deliberately confusing messages to detect the bot's language settings and fallback mechanisms;
  2. Exploration Session: run multiple dialogues in parallel, automatically rephrasing questions or switching topics, and extract functional points;
  3. Type Determination: classify the bot as transactional (executes operations) or informational (provides information);
  4. Functional Analysis: apply a type-specific strategy, finding dependency relationships for transactional bots and identifying independent topics for informational bots;
  5. User Persona Generation: output personas in YAML format for automated testing.
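
The stages above can be read as a simple pipeline. The sketch below only illustrates that data flow and is not TRACER's actual code: the Functionality class, the classify_bot heuristic, and the persona dictionary layout are all hypothetical, and the real tool relies on a large language model for these decisions rather than the hard-coded rules used here.

```python
from dataclasses import dataclass, field


@dataclass
class Functionality:
    """A functional point extracted during exploration (hypothetical structure)."""
    name: str
    # Names of functionalities that must happen before this one.
    requires: list[str] = field(default_factory=list)


def classify_bot(functionalities):
    """Stand-in for stage 3: call a bot transactional when any functionality
    depends on another one. This rule is an assumption for illustration."""
    if any(f.requires for f in functionalities):
        return "transactional"
    return "informational"


def analyze(functionalities, bot_type):
    """Stage 4: choose an analysis strategy based on the bot type."""
    if bot_type == "transactional":
        # Keep the dependency edges as (prerequisite, functionality) pairs.
        return [(req, f.name) for f in functionalities for req in f.requires]
    # Informational bots: every topic stands alone, so there are no edges.
    return []


def build_personas(functionalities, bot_type):
    """Stage 5: one persona per functionality (hypothetical layout that could
    later be serialized to YAML)."""
    return [
        {"goal": f.name, "bot_type": bot_type, "prerequisites": list(f.requires)}
        for f in functionalities
    ]


def run_pipeline(functionalities):
    """Chain stages 3 to 5 over functionalities found during exploration."""
    bot_type = classify_bot(functionalities)
    edges = analyze(functionalities, bot_type)
    personas = build_personas(functionalities, bot_type)
    return bot_type, edges, personas
```

Calling run_pipeline with a list of Functionality objects returns the inferred bot type, the dependency edges, and one persona dictionary per functionality. Stages 1 and 2 are left out because they require a live chatbot to talk to.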

Visualization and Implementation

TRACER generates workflow diagrams with Graphviz. Installation is a single command (pip install chatbot-tracer), with Graphviz as an external dependency. Several parameters are configurable, including the number of sessions, the number of rounds, and the model provider.
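
To give a sense of what a Graphviz workflow diagram is built from, here is a minimal sketch that turns a list of (source, target) pairs into DOT text. The to_dot function and its arguments are hypothetical names chosen for this example; how TRACER itself produces its diagrams is not described in this article.

```python
def to_dot(edges, graph_name="workflow"):
    """Render (source, target) pairs as Graphviz DOT text."""
    lines = [f"digraph {graph_name} {{", "    rankdir=LR;"]
    for source, target in edges:
        # Quote node names so that labels containing spaces stay valid DOT.
        lines.append(f'    "{source}" -> "{target}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot([("view menu", "select pizza"), ("select pizza", "confirm order")]))
```

Saving the printed text to a file such as workflow.dot and running the Graphviz command dot -Tpng workflow.dot -o workflow.png renders it as an image.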

Section 04

Practical Application Scenarios of TRACER

Transactional Bot Testing

For a pizza ordering bot, TRACER identifies the complete process (view menu → select pizza → order drinks → confirm order) and captures the parameters and dependency relationships of each step.
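
One way to picture these dependency relationships is as a directed graph whose topological order gives a valid sequence for a test conversation. The sketch below uses Python's standard graphlib module; the step names and the conversation_order helper are assumptions made for this example, not output from TRACER.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must be completed before it.
dependencies = {
    "view_menu": set(),
    "select_pizza": {"view_menu"},
    "order_drinks": {"select_pizza"},
    "confirm_order": {"order_drinks"},
}


def conversation_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(conversation_order(dependencies))
```

If the captured dependencies ever contained a cycle, static_order would raise graphlib.CycleError, which is a useful signal that the extracted flow needs a second look.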

Informational Bot Testing

For an informational bot such as Ada-UAM, TRACER identifies independent topics such as contact information, business hours, and ticketing processes, and creates a functional node for each.

Section 05

Project Significance and Value of TRACER

TRACER helps fill a gap in test automation for conversational AI, improving testing efficiency and coverage. The generated user personas can be imported directly into testing frameworks, so exploration feeds straight into testing. The visual workflow diagrams help developers understand interaction structures and discover user experience issues.

Section 06

Summary and Outlook

TRACER represents a new direction of "testing AI with AI" and is suitable for testing chatbots and other complex AI systems. As large language models continue to improve, intelligent testing tools of this kind are expected to become a standard part of AI development, helping teams build more reliable conversational systems.