Zing Forum

Research on Time Series Prediction and Ensemble Modeling Under Wind Turbine Icing Conditions

This project predicts wind turbine power under icing conditions from SCADA data. It combines Random Forest, SVR, CNN, LSTM, and Transformer models with Stacking ensemble learning and KMeans working-condition classification to achieve high-precision prediction of turbine operating power.

Tags: wind turbine icing prediction, time series, ensemble learning, SCADA data, machine learning, LSTM, Transformer, Stacking
Published 2026-06-03 11:14 | Recent activity 2026-06-03 11:18 | Estimated read 9 min

Section 01

[Introduction] Overview of Core Content of Wind Turbine Power Prediction Research Under Icing Conditions

This project studies wind turbine power prediction under icing conditions using SCADA data; it was the major experiment for the Machine Learning Introduction course. The original author, Jiaxin2006, published it on GitHub in June 2026 (project link: https://github.com/Jiaxin2006/wind-turbine-icing-forecast). The core goal is high-precision prediction of turbine operating power, addressing the large errors that a single unified model produces under icing conditions. To that end, the project combines Random Forest, SVR, CNN, LSTM, and Transformer models with Stacking ensemble learning and KMeans working-condition classification.

Section 02

Research Background and Problem Definition

Wind power is an important part of clean energy, but blade icing in low-temperature, high-wind environments changes a turbine's aerodynamic characteristics. This significantly affects power output and complicates wind farm power prediction and dispatching. When the data distribution shifts under abnormal conditions such as icing, a single unified model suffers sharply higher prediction errors and generalizes poorly. Dedicated power prediction research for icing conditions therefore has both engineering value and methodological significance. This study takes the wind turbine operating power, OT, as the prediction target. It defines a working condition as the data state determined by temperature, wind speed, and historical operating status, and an icing-related working condition as a special operating state induced by low temperature.

Section 03

Data Source and Preprocessing

The data are real wind turbine operating records from Chengde Power Supply Company of State Grid Jibei Electric Power Co., Ltd., covering the entire month of February 2024. They total 41,760 minute-level time-series observations (29 days × 1,440 minutes) with fields such as timestamp, temperature, wind speed, and OT. Feature engineering works at three levels: environmental variables (temperature, wind speed), time-series features (historical values taken through sliding windows), and rolling statistical features (which describe local trends). Preprocessing splits the training, validation, and test sets strictly in chronological order so that no future information leaks into training.
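The feature levels and the chronological split described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `build_features` and `chronological_split`, the window length of 3, the 70/15/15 split ratios, and the toy values are all assumptions made for the example.

```python
from statistics import mean, pstdev

def build_features(temps, winds, power, window=3):
    """Build one feature row per time step t >= window (hypothetical helper).

    The lag and rolling features use only power values strictly before t,
    so the target power[t] never appears in its own features.
    """
    rows, targets = [], []
    for t in range(window, len(power)):
        history = power[t - window:t]        # sliding window of past OT values
        rows.append({
            "t": t,                          # time index, kept for ordering checks
            "temp": temps[t],                # environmental variables
            "wind": winds[t],
            "lags": list(history),           # time-series (lag) features
            "roll_mean": mean(history),      # rolling statistics over the window
            "roll_std": pstdev(history),
        })
        targets.append(power[t])
    return rows, targets

def chronological_split(rows, targets, train_frac=0.7, val_frac=0.15):
    """Split in time order without shuffling: train, then validation, then test."""
    n = len(rows)
    train_end = int(n * train_frac)
    val_end = int(n * (train_frac + val_frac))
    return (
        (rows[:train_end], targets[:train_end]),
        (rows[train_end:val_end], targets[train_end:val_end]),
        (rows[val_end:], targets[val_end:]),
    )

# Toy stand-in data (made up for illustration, not taken from the SCADA set).
power = [float(i) for i in range(23)]
temps = [-10.0 + 0.5 * i for i in range(23)]
winds = [6.0] * 23

rows, targets = build_features(temps, winds, power, window=3)
train_set, val_set, test_set = chronological_split(rows, targets)
```

Because the split slices by position rather than sampling at random, every training row precedes every validation row in time, which is the property that prevents leakage.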

Section 04

Algorithm Selection and Model Architecture

This project uses multiple types of algorithms to build a comparison system:

  • Baseline models: Random Forest (improves stability by averaging the predictions of many decision trees), SVR (handles nonlinear relationships with kernel methods);
  • Sequence models: CNN (extracts local time-series patterns), LSTM (captures long-term time dependencies), Transformer (parallel processing of long sequences with self-attention mechanism);
  • Ensemble and extension: Stacking ensemble (uses base learner results as meta-learner input), KMeans working condition classification (unsupervised clustering for condition-specific modeling), CNN-LSTM-Attention hybrid model (combines convolution, sequence modeling, and attention mechanism).
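To make the Stacking idea in the list above concrete, here is a deliberately small sketch in which the base learners' predictions become the inputs of a meta-learner. It is not the project's implementation: the two base learners (`persistence` and `window_mean`) are toy stand-ins for Random Forest, SVR, and the sequence models, the meta-learner is a two-weight least-squares fit without intercept, and the ramp series is made up.

```python
def persistence(history):
    """Toy base learner: predict that the next value equals the last one."""
    return history[-1]

def window_mean(history):
    """Toy base learner: predict the mean of the recent window."""
    return sum(history) / len(history)

BASE_LEARNERS = [persistence, window_mean]

def base_predictions(series, window):
    """Return base-learner predictions (one row per step) and the true targets."""
    X, y = [], []
    for t in range(window, len(series)):
        history = series[t - window:t]
        X.append([learner(history) for learner in BASE_LEARNERS])
        y.append(series[t])
    return X, y

def fit_meta(X, y):
    """Meta-learner: least-squares weights for y ~ w0*x0 + w1*x1.

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    p = sum(r[0] * t for r, t in zip(X, y))
    q = sum(r[1] * t for r, t in zip(X, y))
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear; cannot fit weights")
    return [(p * d - q * b) / det, (q * a - p * b) / det]

def stack_predict(weights, history):
    """Combine the base predictions with the learned meta weights."""
    return sum(w * learner(history) for w, learner in zip(weights, BASE_LEARNERS))

# Toy ramp series (made up). Fit the meta-learner on the earlier part only,
# then forecast the step that follows the end of the series.
series = [float(t) for t in range(30)]
X_meta, y_meta = base_predictions(series[:20], window=2)
weights = fit_meta(X_meta, y_meta)
next_value = stack_predict(weights, series[-2:])
```

With trainable base learners, the meta-learner would normally be fitted on out-of-fold or held-out predictions so that it does not see predictions made on the base learners' own training data. The toy learners here have no fitted parameters, so that concern does not arise in this sketch.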

Section 05

Experimental Design and Evaluation Metrics

The experiments proceed in four stages: 1. Basic comparison experiment (performance differences between traditional regression and deep sequence models); 2. Ensemble optimization experiment (performance improvement from the Stacking ensemble); 3. Working condition division experiment (impact of KMeans clustering-based condition modeling on icing state prediction); 4. Extension experiment (hybrid model and statistical tests). The evaluation metrics are MAE (Mean Absolute Error), RMSE (Root Mean Square Error), and MAPE (Mean Absolute Percentage Error), supplemented by prediction curves, residual distributions, and condition-specific error analysis.
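For reference, the three metrics and a simple condition-specific error breakdown can be computed as in the sketch below. The function names, the choice to skip zero-valued targets in MAPE, and the sample numbers and condition labels are assumptions for illustration; the source does not say how the project handles zero power readings.

```python
import math

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.

    Assumption: targets equal to zero are skipped, because the percentage
    error is undefined there (a turbine at standstill reports zero power).
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every target is zero")
    return 100.0 * sum(abs((t - p) / t) for t, p in pairs) / len(pairs)

def mae_by_condition(y_true, y_pred, labels):
    """Group absolute errors by working-condition label and average each group."""
    groups = {}
    for t, p, label in zip(y_true, y_pred, labels):
        groups.setdefault(label, []).append(abs(t - p))
    return {label: sum(errs) / len(errs) for label, errs in groups.items()}

# Made-up numbers, only to exercise the functions.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
labels = ["normal", "icing", "normal"]   # hypothetical condition labels

overall = {"MAE": mae(y_true, y_pred),
           "RMSE": rmse(y_true, y_pred),
           "MAPE": mape(y_true, y_pred)}
per_condition = mae_by_condition(y_true, y_pred, labels)
```

RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a sense of whether errors are spread evenly or concentrated in a few steps, such as the onset of icing.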

Section 06

Engineering Value and Methodological Significance

Engineering application value: The project establishes a modeling process that balances accuracy and interpretability and provides technical support for wind farm power prediction and dispatching decisions. This helps operators prepare for icing conditions in advance, reduce power generation losses, and improve equipment operating safety. Methodological contributions: The systematic comparison clarifies where traditional models, deep sequence models, and ensemble models each apply in wind turbine power prediction, and it verifies how Stacking and working condition division improve model robustness.

Section 07

Project Insights and Extended Reflections

This project demonstrates a machine learning workflow for industrial scenarios: every step, from problem definition, data acquisition, and feature engineering to model selection, experimental design, and result evaluation, has to combine domain knowledge with algorithmic principles. Two points are worth borrowing: 1. Introduce the concept of working conditions, identifying implicit patterns through unsupervised learning and then modeling each condition separately; 2. Guard against data leakage by splitting datasets strictly in chronological order so that the results are credible. These ideas are useful references for similar industrial prediction problems.
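The first point, clustering operating states and then modeling each one separately, can be illustrated with a small sketch. Everything here is an assumption made for the example: a hand-written Lloyd's-algorithm KMeans over (temperature, wind speed) pairs, fixed initial centroids, a per-cluster mean as the simplest possible "condition-specific model", and made-up readings. The project itself may use a library implementation and far richer per-condition models.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
            for p in points]

def kmeans(points, init_centroids, n_iter=20):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(float(v) for v in c) for c in init_centroids]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])   # keep an empty cluster in place
        if updated == centroids:               # converged
            break
        centroids = updated
    return centroids, assign(points, centroids)

def fit_cluster_means(labels, power, k):
    """Simplest condition-specific model: the mean power of each cluster."""
    means = []
    for c in range(k):
        values = [p for p, label in zip(power, labels) if label == c]
        means.append(sum(values) / len(values) if values else None)
    return means

# Made-up (temperature in deg C, wind speed in m/s) readings and power values.
points = [(-12.0, 9.0), (-10.0, 8.0), (-11.0, 10.0),
          (2.0, 5.0), (3.0, 6.0), (1.0, 4.0)]
power = [300.0, 320.0, 310.0, 600.0, 620.0, 610.0]

centroids, labels = kmeans(points, init_centroids=[points[0], points[3]])
cluster_means = fit_cluster_means(labels, power, k=2)

# Route a new reading to its working condition, then use that condition's model.
new_point = (-9.0, 9.0)
condition = assign([new_point], centroids)[0]
prediction = cluster_means[condition]
```

In practice the features would usually be standardized before clustering, since temperature and wind speed are on different scales and the Euclidean distance would otherwise be dominated by whichever has the larger range.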