Section 01
[Introduction] LithoBench Benchmark: Evaluating the Remote Sensing Petrology Interpretation Capabilities of Multimodal Large Models
This article introduces LithoBench, a benchmark that assesses how well large vision-language models understand geological semantics in remote sensing petrology interpretation tasks. The benchmark comprises 10,000 expert-annotated samples spanning five cognitive levels, and experiments show that existing models remain significantly limited on higher-order reasoning tasks.