Section 01
SystemsBench: Introduction to the Open-Source Systems Thinking Evaluation Framework for Large Language Models
SystemsBench is an open-source evaluation framework for testing the systems thinking capabilities of large language models and intelligent agents. It is grounded in Donella Meadows' systems thinking theory and scores responses along five dimensions: understanding of stocks and flows, identification of feedback loops, perception of time delays, location of leverage points, and paradigm reflection. Evaluation runs through a nine-stage recursive engine, the SenseRun ritual, which is designed to be self-evolving and self-correcting. The project is maintained by InitiumBuilders/Outlier.Systems, and the source code is available at https://github.com/InitiumBuilders/SystemsBench.
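To make the five-dimensional scoring idea concrete, here is a minimal Python sketch of how one response's scores might be held and summarized. It is an illustration only and is not taken from the SystemsBench codebase: the class name `SystemsScore`, the field names, the 0 to 1 scale, and the unweighted mean are all assumptions.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SystemsScore:
    """Hypothetical container for the five dimensions (not SystemsBench's actual schema)."""

    stocks_and_flows: float
    feedback_loops: float
    time_delays: float
    leverage_points: float
    paradigm_reflection: float

    def __post_init__(self):
        # Assumed scale: each dimension is scored between 0 and 1 inclusive.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must be in [0, 1], got {value}")

    def overall(self) -> float:
        # Assumed aggregation: unweighted mean of the five dimensions.
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)

    def weakest(self) -> str:
        # Name of the lowest-scoring dimension, useful for targeted feedback.
        return min(fields(self), key=lambda f: getattr(self, f.name)).name


score = SystemsScore(0.8, 0.6, 0.4, 0.7, 0.5)
print(f"overall={score.overall():.2f}, weakest={score.weakest()}")
```

Keeping the dimensions as named fields rather than a bare list makes it easy to report which aspect of systems thinking a model handles worst, whatever aggregation the real framework uses.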