Zing Forum

SkyScraper: A Multi-Agent Feedback System for Automatic Detection of News Events from Satellite Imagery

This article introduces SkyScraper, a system that uses an iterative multi-agent workflow to geocode news articles and match them with satellite image sequences. It detects nearly 5 times more events than traditional methods and is used to build a multi-temporal remote sensing dataset of approximately 5000 sequences.

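Tags: remote sensing imagery, multi-agent systems, geocoding, satellite imagery, change detection, SkyScraper, multi-temporal data, news event detection, LLM applications, Earth observation
Published 2026-04-14 22:12 | Recent activity 2026-04-15 11:20 | Estimated read 5 min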

Section 01

Overview of the SkyScraper System: Multi-Agent Feedback for Automatic Detection of News Events in Satellite Imagery

SkyScraper uses an iterative multi-agent workflow to geocode news articles and match them with satellite image sequences. Its goal is to address the scarcity of multi-temporal remote sensing datasets that describe events. It detects nearly 5 times more events than traditional methods and is used to build a multi-temporal remote sensing dataset of approximately 5000 sequences.


Section 02

Data Dilemmas in Remote Sensing Image Analysis and Limitations of Traditional Methods

Changes visible in satellite imagery often unfold gradually, so a single image pair rarely captures a whole event. Yet multi-temporal event description datasets (sequences of at least two images) are scarce, because finding and annotating suitable sequences is time-consuming. Traditional methods rely on manual annotation or rule-based processing, handle only bi-temporal image pairs, and focus on land use and land cover changes. Recent LLM-based methods still depend on pre-labeled datasets, and multi-temporal descriptions have so far been limited to the drone video domain.


Section 03

Five-Step Iterative Workflow and Feedback Mechanism of the SkyScraper System

SkyScraper is an iterative multi-agent workflow with five steps:

1. Extraction: an LLM extracts geographic entities and the event timeline from the article.
2. Geocoding: the Mapbox API converts the extracted place names into coordinates.
3. Image acquisition: PlanetScope imagery is retrieved for the geocoded location.
4. Verification: a multi-modal LLM cross-checks the article against the imagery to confirm that the event is visible.
5. Description: the system generates a description of the changes across the image sequence.

The key innovation is iterative feedback: when geocoding or verification fails, the failure information is used to request new candidate locations and refine the search.
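The control flow of the five steps and the feedback loop can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `run_pipeline`, `Attempt`, `Result`, the `max_rounds` limit, and the five callables (`extract`, `geocode`, `fetch_images`, `verify`, `describe`) are all hypothetical names standing in for the LLM, Mapbox, and PlanetScope components, and timeline handling is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Coords = tuple[float, float]  # (latitude, longitude); assumed convention


@dataclass
class Attempt:
    place: str   # candidate location that was rejected
    reason: str  # failure information fed back to the extraction agent


@dataclass
class Result:
    place: str
    coords: Coords
    description: str
    attempts: list[Attempt] = field(default_factory=list)


def run_pipeline(
    article: str,
    extract: Callable[[str, list[Attempt]], Optional[str]],
    geocode: Callable[[str], Optional[Coords]],
    fetch_images: Callable[[Coords], list[str]],
    verify: Callable[[str, list[str]], tuple[bool, str]],
    describe: Callable[[str, list[str]], str],
    max_rounds: int = 3,  # hypothetical cap on feedback iterations
) -> Optional[Result]:
    failures: list[Attempt] = []
    for _ in range(max_rounds):
        # Step 1: the extractor sees earlier failures and proposes a new place.
        place = extract(article, failures)
        if place is None:
            break  # no candidates left to try
        # Step 2: geocoding; a miss is recorded and fed back.
        coords = geocode(place)
        if coords is None:
            failures.append(Attempt(place, "geocoding returned no match"))
            continue
        # Step 3: image acquisition for the geocoded location.
        images = fetch_images(coords)
        # Step 4: verification; a rejection is recorded and fed back.
        visible, reason = verify(article, images)
        if not visible:
            failures.append(Attempt(place, reason))
            continue
        # Step 5: description, reached only for verified events.
        return Result(place, coords, describe(article, images), failures)
    return None
```

Passing the agents in as plain callables keeps the loop independent of any particular model or API client, so each step can be stubbed out when testing the feedback logic on its own.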


Section 04

Experimental Validation: SkyScraper Detects Nearly 5 Times More Events

The research team evaluated three methods on 1000 news articles: weighted centroid, GIPSY, and SkyScraper. SkyScraper detected nearly 5 times more events than the traditional methods. The improvement is attributed to three factors: agent verification eliminates false positives, iterative feedback refines the search, and multi-modal fusion improves accuracy.
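The weighted centroid baseline is named here without being defined. As a rough illustration of what such a baseline generally does, the sketch below assumes it collapses several candidate coordinates for an article into a single point by a weighted average. The function name `weighted_centroid`, the `(lat, lon, weight)` input format, and the choice to average on the unit sphere (so that points on either side of the antimeridian combine correctly) are assumptions made for illustration, not details from the study.

```python
import math


def weighted_centroid(points):
    """Return the weighted centroid (lat, lon) in degrees.

    points: iterable of (lat_deg, lon_deg, weight) tuples (assumed format).
    Averages unit vectors in 3D rather than raw latitude/longitude values.
    """
    x = y = z = 0.0
    for lat, lon, w in points:
        phi, lam = math.radians(lat), math.radians(lon)
        x += w * math.cos(phi) * math.cos(lam)
        y += w * math.cos(phi) * math.sin(lam)
        z += w * math.sin(phi)
    if math.hypot(x, y, z) < 1e-12:
        raise ValueError("centroid undefined: weights cancel out or are zero")
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon
```

A single averaged point of this kind has no way to reject a wrong candidate, which is consistent with the contrast drawn above between baselines and a workflow that verifies each location against imagery.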


Section 05

SkyScraper Constructs a Dataset of 5000 Multi-Temporal Sequences

The team applied SkyScraper to GDELT news articles from 2022 to 2024, using PlanetScope imagery, to build a multi-temporal description dataset. After annotator verification, the resulting SkyScraper GDELT dataset contains approximately 5000 sequences. The team also generated a Sentinel-2 version, demonstrating that the workflow can curate data at scale.


Section 06

Application Value and Future Outlook of SkyScraper

Application value: SkyScraper can support journalism by providing visual evidence for reported events, disaster response by detecting disaster impacts, and urban planning and environmental monitoring by tracking changes over time. Future outlook: as satellite technology and AI agents mature, systems of this kind are expected to play a larger role in Earth observation and related fields.


Section 07

Technical Insights, Limitations, and Conclusion

Technical insights: iterative agent feedback outperforms single-round reasoning, multi-modal verification improves reliability, and the modular design makes the workflow easier to extend. Limitations: the system depends on the quality of the news articles and on the availability of satellite imagery, it has a high computational cost, and the verification agents can make errors. Conclusion: SkyScraper is an important advance in remote sensing analysis and a practical case study for AI directions such as multi-agent collaboration.