# ChartAct: A New Benchmark for Dynamic Chart Understanding

> Existing chart understanding benchmarks focus on static charts, but real-world charts are often interactive. On the ChartAct benchmark, even the strongest model evaluated, Claude-Opus-4.7, reaches a success rate of only 84.5% on dynamic chart tasks.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-26T13:15:21.000Z
- Last activity: 2026-05-27T02:30:17.394Z
- Popularity: 126.8
- Keywords: chart understanding, multimodal models, dynamic interaction, data visualization, GUI agent, state tracking, multi-step reasoning
- Page link: https://www.zingnex.cn/en/forum/thread/chartact
- Canonical: https://www.zingnex.cn/forum/thread/chartact
- Markdown source: floors_fallback

---

## ChartAct Benchmark: New Challenges and Current Status of Dynamic Chart Understanding

Most existing chart understanding benchmarks target static charts, yet real-world charts are often interactive. ChartAct, a new benchmark for dynamic charts, exposes the limits of current AI models on such tasks: even the strongest model evaluated, Claude-Opus-4.7, reaches a success rate of only 84.5%. The benchmark gives researchers a tool for measuring dynamic chart understanding.

## The Gap Between Static and Dynamic Chart Understanding

Multimodal AI has made significant progress on static chart understanding, but real-world charts often include interactive elements such as hover tooltips, click-to-expand, and drag-to-adjust controls. Current AI systems perform noticeably worse on these dynamic tasks than on static ones.

## Design and Core Evaluation Capabilities of the ChartAct Benchmark

ChartAct collects 673 dynamic charts of 7 types from 8 real-world websites and builds 1,440 Q&A samples on top of them. Evaluation runs in two environments: standalone dynamic charts and dashboards. The benchmark targets three core capabilities:

1. Visible content recognition.
2. Interaction selection (hover, click, and so on).
3. State reasoning (tracking the chart's state after multi-step interactions).
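To make the interaction selection and state reasoning capabilities concrete, here is a minimal Python sketch of a toy interactive chart. It is not ChartAct's actual environment or data format: the `ChartState` class, the `apply_action` function, the action names `click_legend` and `hover`, and the sample values are all hypothetical, chosen only to show why the answer to a question depends on the state left behind by earlier interactions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ChartState:
    """Toy stand-in for an interactive chart (hypothetical, not ChartAct's format)."""
    data: Dict[str, Dict[str, float]]              # series -> {category: value}
    hidden_series: Set[str] = field(default_factory=set)
    tooltip: Optional[str] = None                  # text shown by the last hover

    def visible_series(self) -> List[str]:
        return [s for s in self.data if s not in self.hidden_series]


def apply_action(state: ChartState, action: Tuple[str, ...]) -> ChartState:
    """Apply one interaction in place and return the updated state."""
    kind = action[0]
    if kind == "click_legend":
        series = action[1]
        if series not in state.data:
            raise ValueError(f"no such legend entry: {series}")
        state.hidden_series ^= {series}            # toggle visibility
        state.tooltip = None                       # a click dismisses the tooltip
    elif kind == "hover":
        series, category = action[1], action[2]
        if series not in state.data or series in state.hidden_series:
            raise ValueError(f"series not visible: {series}")
        if category not in state.data[series]:
            raise ValueError(f"no such category: {category}")
        state.tooltip = f"{series} / {category}: {state.data[series][category]}"
    else:
        raise ValueError(f"unknown action: {kind}")
    return state


# Illustrative values only.
state = ChartState(data={
    "2023": {"Q1": 10.0, "Q2": 12.0},
    "2024": {"Q1": 11.0, "Q2": 15.0},
})
for action in [("click_legend", "2023"), ("hover", "2024", "Q2")]:
    apply_action(state, action)

print(state.visible_series())  # ['2024']
print(state.tooltip)           # 2024 / Q2: 15.0
```

In this sketch, a model that forgets the legend click would still believe both series are visible, which is the kind of state reasoning error the third capability is meant to probe.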

## Experimental Results: Top Models Still Have Significant Limitations

An evaluation of 11 models found that Claude-Opus-4.7 reaches a success rate of 84.5%, while most models stay below 60%. Models handle simple interactions well, but their success rate on multi-step tasks (such as filtering followed by comparison) is often below 30%. Performance is significantly worse in the dashboard environment, where multiple charts are linked.
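A per-category breakdown like the one above can be computed with a few lines of Python. The sketch below assumes each evaluated sample is reduced to a `(category, succeeded)` pair; that record shape, the `success_rates` helper, and the category names are assumptions for illustration, and the toy outcomes are made up rather than taken from ChartAct.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def success_rates(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the success rate in percent for each task category."""
    totals: Dict[str, int] = defaultdict(int)
    wins: Dict[str, int] = defaultdict(int)
    for category, succeeded in results:
        totals[category] += 1
        wins[category] += int(succeeded)
    return {c: 100.0 * wins[c] / totals[c] for c in totals}


# Made-up outcomes for illustration only; these are not ChartAct results.
toy_results = [
    ("single_step", True), ("single_step", True),
    ("single_step", False), ("single_step", True),
    ("multi_step", True), ("multi_step", False),
    ("multi_step", False), ("multi_step", False),
]
rates = success_rates(toy_results)
print(rates)
```

Reporting rates per category rather than as one aggregate is what makes a gap between simple and multi-step tasks visible in the first place.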

## Failure Cases: Main Bottlenecks of Dynamic Understanding

Failure cases fall into three categories:

1. Incorrect interaction selection: the model selects a non-clickable element or performs an operation that has no effect.
2. State tracking failure: the model does not update its understanding of the chart after an interaction.
3. Premature termination: the model answers before it has obtained all the information it needs.
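The three categories above can be expressed as a simple rule-based labeler over an interaction trace. This is a hedged sketch, not the procedure used to analyze ChartAct failures: the `Step` and `Episode` records, their fields, and the order in which the rules are checked are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    action_valid: bool     # did the action hit an interactive element and take effect?
    believed_state: str    # what the model reports the chart shows afterwards
    actual_state: str      # what the chart actually shows afterwards


@dataclass
class Episode:
    steps: List[Step]
    required_steps: int    # interactions needed to expose all required information
    answer_correct: bool


def classify_failure(ep: Episode) -> Optional[str]:
    """Label a failed episode; returns None when the answer was correct."""
    if ep.answer_correct:
        return None
    # Rule order is an assumption: the earliest kind of error in the pipeline wins.
    if any(not s.action_valid for s in ep.steps):
        return "incorrect_interaction"
    if any(s.believed_state != s.actual_state for s in ep.steps):
        return "state_tracking_failure"
    if len(ep.steps) < ep.required_steps:
        return "premature_termination"
    return "other"


# Hypothetical traces for illustration.
good = Step(True, "2024 only", "2024 only")
stale = Step(True, "all series", "2024 only")
missed = Step(False, "all series", "all series")

print(classify_failure(Episode([good], required_steps=2, answer_correct=False)))
print(classify_failure(Episode([good, stale], required_steps=2, answer_correct=False)))
print(classify_failure(Episode([missed, good], required_steps=2, answer_correct=False)))
```

A real trace can exhibit more than one problem at once, so a single-label scheme like this one is a simplification; returning a set of labels would be a natural extension.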

## Implications and Future Directions of ChartAct

Dynamic chart understanding calls for targeted training in interaction selection, state tracking, and multi-step planning. Models also need stronger GUI understanding, including linkage across multiple charts. Application scenarios include business intelligence, scientific research, and news. For now the technology still requires human supervision, and future work needs to break through the multi-step reasoning bottleneck.
