Section 01
[Introduction] SequenBench: A New Benchmark for Evaluating Visual Sorting Capabilities of Multimodal Large Language Models
SequenBench is an evaluation benchmark designed specifically to test the visual sorting capabilities of multimodal large language models (MLLMs). It contains 6,761 images and 7,261 multiple-choice questions. The benchmark aims to fill a gap in the evaluation of MLLMs' visual sorting capabilities. It is open-sourced under the Apache-2.0 license and provides researchers with a unified evaluation standard and tools.