Zing Forum

GAIE: Real-Time Geomagnetic Storm Prediction Engine Based on NASA Satellite Data

A geomagnetic storm prediction system that combines an XGBoost model with SHAP interpretability and real-time NASA/NOAA satellite data, reporting an R² of about 0.97 for Kp index regression and a weighted F1 score of about 0.98 for G-scale classification.

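Tags: Geomagnetic storms, Space weather, Machine learning, XGBoost, NASA, NOAA, SHAP, Explainable AI, Time-series forecasting, Satellite data
Published 2026-06-09 08:04 · Recent activity 2026-06-09 08:21 · Estimated read 7 min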

Section 01

GAIE Engine Core Introduction: Real-Time Geomagnetic Storm Prediction System Based on NASA/NOAA Data

GAIE (Geomagnetic AI Engine) is a core component of the HELIOS Space Intelligence Platform. It combines an XGBoost model with SHAP interpretability and real-time NASA/NOAA satellite data, reporting an R² of about 0.97 for Kp index prediction and a weighted F1 score of about 0.98 for G-scale storm classification. Its goal is to shift space weather monitoring from passive response to predictive defense, helping protect critical infrastructure from geomagnetic storms.

Section 02

Project Background: Why Do We Need to Predict Geomagnetic Storms?

Geomagnetic storms are caused by the interaction between the solar wind and Earth's magnetosphere, and they pose a serious threat to the critical infrastructure of modern society. Historical events include:

  1. The 1989 Quebec blackout (6 million people lost power for 9 hours, with $2 billion in losses);
  2. The 2003 Halloween storm (30 satellites damaged, global high-frequency communication interrupted);
  3. The 1859 Carrington Event (if repeated today, estimated losses would range from $0.6 trillion to $2.6 trillion).

Affected fields include communication satellites, GPS, power grids, polar aviation communication, and astronaut safety.

Section 03

System Architecture and Data Sources

The HELIOS platform integrates NASA/NOAA satellite data and includes five modules: orbital launch schedule, solar event monitoring, satellite tracking, AI prediction (GAIE), and solar energy optimization. GAIE addresses the core problem: predicting the intensity of geomagnetic disturbances (Kp index and G-scale level) over the next few hours from solar wind data measured by the DSCOVR satellite at the Sun-Earth L1 point. All data sources are public government APIs: NOAA SWPC provides the solar wind magnetic field, plasma, and Kp index data, and NASA DONKI provides flare and storm events.
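
As a rough illustration of the ingestion step, the sketch below parses a solar wind feed into typed records. It assumes the feed arrives as a JSON array whose first row is a header and whose values are strings or null. That layout, the field names, and the function name `parse_table` are assumptions for illustration, and the sample payload is hand-written rather than real measurements. It is not taken from the GAIE code.

```python
import json

# Hand-written sample in the ASSUMED layout: header row, then string values.
SAMPLE_PAYLOAD = """[
  ["time_tag", "bx_gsm", "by_gsm", "bz_gsm", "bt"],
  ["2026-06-09 08:00:00.000", "1.2", "-3.4", "-7.8", "8.6"],
  ["2026-06-09 08:01:00.000", "1.0", null, "-8.1", "8.9"]
]"""


def parse_table(payload):
    """Turn a header-plus-rows JSON table into a list of dicts.

    Numeric fields become floats; missing values (null) stay None so that
    later outlier handling can decide what to do with them.
    """
    header, *rows = json.loads(payload)
    records = []
    for row in rows:
        record = {}
        for name, value in zip(header, row):
            if name == "time_tag" or value is None:
                record[name] = value
            else:
                record[name] = float(value)
        records.append(record)
    return records


records = parse_table(SAMPLE_PAYLOAD)
```

Keeping missing readings as `None` instead of dropping rows at parse time leaves the time axis intact, which matters for the time alignment step.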

Section 04

Data Engineering and Feature Design

The dataset contains 11,249 records: 9,749 real DSCOVR measurements plus 1,500 synthetic records that supplement the rare extreme-storm samples. Feature engineering produced 20 features with clear physical meanings, such as bz_negativo (the southward Bz component, which opens the magnetic reconnection channel), newell_coupling (the energy transfer rate from the solar wind to the magnetosphere), and pressao_dinamica (the dynamic pressure that compresses the magnetosphere). Preprocessing steps: deduplication, outlier handling, time alignment, stratified data splitting, and robust scaling.
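
The three named features can be sketched from standard space physics formulas. This is a minimal sketch, not the project's actual implementation. The definition of bz_negativo as max(0, -Bz), the function signatures, and the units are assumptions. The coupling function follows the commonly cited Newell form v^(4/3) · B_T^(2/3) · sin^(8/3)(θ/2), where B_T is the transverse field and θ the IMF clock angle.

```python
import math

# Converts n [cm^-3] * v^2 [km^2/s^2] into nPa using the proton mass.
PROTON_PRESSURE_FACTOR = 1.6726e-6


def bz_negativo(bz_nt):
    """Southward part of Bz in nT (assumed definition: max(0, -Bz))."""
    return max(0.0, -bz_nt)


def pressao_dinamica(density_cm3, speed_kms):
    """Solar wind proton dynamic pressure in nPa: P = m_p * n * v^2."""
    return PROTON_PRESSURE_FACTOR * density_cm3 * speed_kms ** 2


def newell_coupling(speed_kms, by_nt, bz_nt):
    """Newell-style coupling function in arbitrary units."""
    b_t = math.hypot(by_nt, bz_nt)          # transverse field magnitude
    clock_angle = math.atan2(by_nt, bz_nt)  # 0 = northward, pi = southward
    half_sine = abs(math.sin(clock_angle / 2.0))
    return speed_kms ** (4 / 3) * b_t ** (2 / 3) * half_sine ** (8 / 3)


# Purely southward field couples strongly; purely northward field gives zero.
southward = newell_coupling(500.0, 0.0, -10.0)
northward = newell_coupling(500.0, 0.0, 10.0)
```

The clock-angle term is what makes the feature physically meaningful: the same field strength contributes nothing when it points north and contributes fully when it points south.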

Section 05

Model Selection and Performance Comparison

Regression task (Kp index): XGBoost performed best (RMSE = 0.2768, MAE = 0.1704, R² = 0.9678), outperforming Ridge regression (R² = 0.82) and Random Forest. Classification task (G-scale level): XGBoost achieved an accuracy of 0.9787 and a weighted F1 score of 0.9772. Key insight: the relationship between solar wind and geomagnetic activity is nonlinear, and gradient boosting captures the complex feature interactions better than a linear model can.
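
For readers unfamiliar with the reported metrics, the sketch below computes RMSE, MAE, R², and weighted F1 from scratch. The function names and the toy inputs are illustrative assumptions. They are not the project's evaluation code or data.

```python
import math
from collections import Counter


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for paired true/predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        total += count * (2 * tp / (2 * tp + fp + fn))
    return total / len(y_true)


# Toy values for illustration only.
reg = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
f1 = weighted_f1(["G0", "G0", "G1", "G1"], ["G0", "G1", "G1", "G1"])
```

R² is a goodness-of-fit measure rather than an accuracy, which is why it is reported separately from the classification scores.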

Section 06

SHAP Interpretability: Physical Laws Learned by the Model

SHAP analysis was performed with TreeExplainer, and the resulting feature importance ranking is:

  1. bz_negativo (southward Bz, magnetic reconnection trigger);
  2. newell_coupling (energy transfer efficiency);
  3. CME (main cause of G3-G5 storms);
  4. Wind speed;
  5. Dynamic pressure.

The ranking is consistent with plasma physics research, which suggests that the model captures real physical phenomena rather than statistical artifacts.
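
A global ranking like the one above is commonly obtained by averaging the absolute SHAP value of each feature over all samples. The sketch below shows only that aggregation step on a small hand-made matrix. The numbers and the name `wind_speed` are illustrative assumptions. In a real run the matrix would come from something like `shap.TreeExplainer(model).shap_values(X)`.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean |SHAP| across samples, largest first."""
    n_samples = len(shap_rows)
    scores = {
        name: sum(abs(row[i]) for row in shap_rows) / n_samples
        for i, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy SHAP matrix: one row per sample, one column per feature (made up).
feature_names = ["wind_speed", "bz_negativo", "newell_coupling"]
toy_shap = [
    [0.1, 0.9, -0.4],
    [-0.2, -1.1, 0.5],
    [0.1, 0.7, -0.3],
]

ranking = rank_by_mean_abs_shap(toy_shap, feature_names)
top_features = [name for name, _ in ranking]
```

Taking the absolute value first matters: a feature that pushes predictions strongly up for some samples and strongly down for others would otherwise average out to near zero.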

Section 07

Deployment, Applications, and SDG Alignment

GAIE is deployed as a Streamlit application (link: https://globalsolutiongenerativeai-gkw5rmitemjc8d7ue7mvub.streamlit.app). Its functional modules include real-time prediction, SHAP explanation, model metrics, and project introduction. In a simulated extreme storm, the inputs Bz = -30 nT, wind speed 750 km/s, a CME, and an M-class flare yield a prediction of Kp 7-8, which corresponds to G3-G4. Alignment with UN SDGs: SDG9 (Protect Infrastructure), SDG13 (Climate Action), SDG11 (Sustainable Cities).
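
The correspondence between Kp and G level in the simulated storm follows the NOAA geomagnetic storm scale, where Kp 5 maps to G1 and Kp 9 to G5. A minimal sketch of that mapping is shown below. The function name `kp_to_g` and the "G0" label for sub-storm conditions are conventions assumed here, not details confirmed by the project.

```python
import math


def kp_to_g(kp):
    """Map a Kp value (0 to 9) to a NOAA G-scale label.

    Kp 5 -> G1, 6 -> G2, 7 -> G3, 8 -> G4, 9 -> G5.
    Values below 5 get the informal label "G0" (no storm).
    """
    if not 0 <= kp <= 9:
        raise ValueError("Kp must be between 0 and 9")
    level = math.floor(kp) - 4
    return f"G{level}" if level >= 1 else "G0"


# The simulated extreme storm from the text: Kp 7-8.
storm_levels = [kp_to_g(7), kp_to_g(8)]
```

Fractional Kp values are floored, so a reading of 4.67 stays below the G1 threshold.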

Section 08

Conclusion: Value and Significance of GAIE

GAIE combines machine learning with space physics knowledge to address a problem of practical social value. With an R² of about 0.97 and a weighted F1 score of about 0.98, the model could give satellite operators, grid managers, and similar users warnings hours in advance, letting them take protective measures that may avoid billions of dollars in losses. Its technical highlights include end-to-end ML engineering, physics-informed ML, explainable AI, a robust data strategy, and production-level deployment.