Section 01
[Introduction] MASTIF: Core Value of a Standardized Evaluation Framework for Multi-Agent Systems
MASTIF is an open-source evaluation framework for multi-agent systems, developed by the Brazilian Web Intelligence Research Group (CEWEB.br). It targets three recurring problems in agent evaluation: results that cannot be compared across fragmented frameworks, test scenarios that are too narrow, and metrics that capture only one dimension of performance. MASTIF supports mainstream agent frameworks such as CrewAI and LangChain, works with both closed-source models such as those from OpenAI and open-source models such as Llama, and integrates real-world test scenarios from Mind2Web. The result is a cross-framework, reproducible, multi-dimensional evaluation system for developers and researchers.