Section 01
IWC-bench: Introduction to the Standardized Benchmark for Bioinformatics Agents
IWC-bench is a benchmark for evaluating bioinformatics agents. Its tasks are derived from peer-reviewed Galaxy workflows maintained by the IWC (Intergalactic Workflow Commission) community, and it aims to provide a standardized testing framework for AI applications in bioinformatics. It addresses the problem that existing AI evaluation benchmarks are oversimplified and do not reflect the complexity of real bioinformatics tasks. Because the evaluation tasks are built from validated, high-quality workflows, the benchmark is intended to keep evaluations realistic and reproducible.