Zing Forum

Aero Engine Remaining Useful Life Prediction: A Machine Learning Practical Project Based on NASA C-MAPSS Data

A production-grade machine learning project that uses an XGBoost model to predict the remaining useful life (RUL) of aircraft turbofan engines. It includes a complete training pipeline, a FastAPI inference service, Docker containerization, and deployment on AWS.

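Tags: Machine Learning · Predictive Maintenance · Aero Engines · XGBoost · NASA C-MAPSS · FastAPI · Docker · AWS · Time-Series Forecasting · Industrial AI
Published 2026-06-09 18:44 · Recent activity 2026-06-09 18:56 · Estimated read 9 min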

Section 01

[Introduction] Aero Engine Remaining Useful Life Prediction: A Machine Learning Practical Project Based on NASA C-MAPSS Data

Core Project Overview

This production-grade project predicts the Remaining Useful Life (RUL) of aircraft turbofan engines from the NASA C-MAPSS dataset. It builds the prediction system on an XGBoost model and ships a complete engineering stack around it: a FastAPI inference service, Docker containerization, and deployment on AWS.

Project Basic Information

  • Original Author/Maintainer: kallurivenkatesh4416-commits
  • Source Platform: GitHub
  • Project Link: aero-rul-predictor
  • Release Date: June 9, 2026

The core value of the project lies in mapping prediction results to risk levels, providing intuitive guidance for aviation maintenance decisions, and promoting the implementation of predictive maintenance in the aviation industry.

Section 02

Project Background and Significance

Aero engine maintenance is a core part of aviation safety. Traditional fixed-interval maintenance is inefficient: replacing components too early wastes resources, while replacing them too late creates safety risks. Predictive maintenance, which estimates remaining useful life by analyzing sensor data, has become a technical trend in the aviation industry.

This project builds a complete production-grade solution based on the publicly available NASA C-MAPSS dataset, predicts engine RUL and maps it to risk levels, providing data support for maintenance decisions.

Section 03

Data Foundation and Model Design

Data Source

The project uses the FD001 subset of the NASA C-MAPSS dataset, which includes:

  • 100 training engines (run until failure) and 100 test engines (stopped before failure)
  • 21 sensor readings plus 3 operational settings per cycle
  • The complete degradation time series of each training engine, from healthy operation to failure
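
The data description above can be turned into training labels roughly as follows. This is a minimal sketch, not the project's actual loader: it assumes the usual whitespace-separated C-MAPSS row layout (engine ID, cycle, 3 settings, 21 sensors) and assumes RUL is defined as the engine's last recorded cycle minus the current cycle. The names `parse_row` and `add_rul_labels` are hypothetical.

```python
from collections import defaultdict

# Assumed row layout: engine id, cycle, 3 operational settings, 21 sensors.
N_SETTINGS = 3
N_SENSORS = 21


def parse_row(line):
    """Parse one whitespace-separated row into a dict (hypothetical helper)."""
    values = line.split()
    expected = 2 + N_SETTINGS + N_SENSORS
    if len(values) != expected:
        raise ValueError(f"expected {expected} columns, got {len(values)}")
    return {
        "engine_id": int(values[0]),
        "cycle": int(values[1]),
        "settings": [float(v) for v in values[2:2 + N_SETTINGS]],
        "sensors": [float(v) for v in values[2 + N_SETTINGS:]],
    }


def add_rul_labels(rows):
    """Attach rul = (last recorded cycle of that engine) - (current cycle).

    This only makes sense for run-to-failure engines, where the last
    recorded cycle is the failure point.
    """
    last_cycle = defaultdict(int)
    for row in rows:
        engine = row["engine_id"]
        last_cycle[engine] = max(last_cycle[engine], row["cycle"])
    return [{**row, "rul": last_cycle[row["engine_id"]] - row["cycle"]} for row in rows]
```

Because test engines stop before failure, their last recorded cycle is not the failure point, so this labeling applies to the training engines only.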

Model and Preprocessing

The project adopts an XGBoost regression model with the following preprocessing steps:

  • Split training and validation data by engine ID to avoid data leakage
  • Cap RUL targets above 125 cycles at 125, so the model focuses on the critical degradation stage
  • Automatically remove uninformative columns such as constant sensors, the engine ID, and the cycle counter
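
The three preprocessing steps could look something like the sketch below. It uses only the standard library and is an illustration, not the project's code: the 20% validation fraction, the seed, and all function names are assumptions; only the cap of 125 comes from the article.

```python
import random

RUL_CAP = 125  # cap stated in the article


def split_engine_ids(engine_ids, val_fraction=0.2, seed=42):
    """Split unique engine IDs so that no engine lands in both sets.

    val_fraction and seed are illustrative defaults, not project values.
    Returns (train_ids, val_ids) as sets.
    """
    unique_ids = sorted(set(engine_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_val = max(1, round(len(unique_ids) * val_fraction))
    return set(unique_ids[n_val:]), set(unique_ids[:n_val])


def cap_rul(rul, cap=RUL_CAP):
    """Clip the target so early, healthy cycles all share the same label."""
    return min(rul, cap)


def constant_columns(rows, tolerance=0.0):
    """Return indices of feature columns whose value never changes."""
    if not rows:
        return []
    n_cols = len(rows[0])
    return [
        j for j in range(n_cols)
        if max(r[j] for r in rows) - min(r[j] for r in rows) <= tolerance
    ]
```

Splitting on IDs rather than on rows matters because consecutive cycles of one engine are highly correlated; a row-level split would let the validation set contain near-duplicates of training rows.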

Model Performance

  • MAE (Mean Absolute Error): 12.12 cycles
  • RMSE (Root Mean Square Error): 16.76 cycles
  • R² Score: 0.839
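
For readers who want to reproduce metrics of this kind on their own predictions, the sketch below computes MAE, RMSE, and R² from their standard definitions. It is not the project's evaluation script, and the function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, and R² for paired true/predicted RUL values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R² is undefined when every true value is identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }
```

MAE and RMSE are expressed in cycles, the same unit as the target, while R² is unitless.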

Section 04

Risk Classification System

The project converts numerical predictions into 3 practical risk levels to facilitate quick decisions by maintenance teams:

Risk Level    Predicted RUL    Maintenance Advice
Critical      ≤30 cycles       Arrange maintenance immediately
Warning       31-70 cycles     Plan maintenance in the near future
Healthy       >70 cycles       Continue monitoring operation

This classification lets non-technical personnel make maintenance decisions without digging into model details.
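
The mapping from predicted RUL to risk level can be sketched as a small function. The thresholds (30 and 70 cycles) and the advice strings come from the article; how fractional predictions between 30 and 31 are handled is not stated, so treating them as Warning is an assumption here, as are the names `risk_level` and `ADVICE`.

```python
ADVICE = {
    "Critical": "Arrange maintenance immediately",
    "Warning": "Plan maintenance in the near future",
    "Healthy": "Continue monitoring operation",
}


def risk_level(predicted_rul):
    """Map a predicted RUL (in cycles) to one of the three risk levels."""
    if predicted_rul <= 30:
        return "Critical"
    if predicted_rul <= 70:
        # Assumption: fractional values in (30, 31) fall into Warning.
        return "Warning"
    return "Healthy"
```

For example, `ADVICE[risk_level(42.0)]` yields the Warning advice, "Plan maintenance in the near future".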

Section 05

Engineering Implementation Highlights

FastAPI Inference Service

  • Provides health check and prediction endpoints (receives sensor data and returns RUL and risk level)
  • Automatically generates Swagger UI documentation, supporting request/response data validation
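
As a rough illustration of what the two endpoints compute, here is a framework-agnostic sketch in plain Python. It deliberately leaves out the FastAPI routing and schema layer, and everything about the payload is an assumption: the field names `features`, `predicted_rul`, and `risk_level`, the fixed length of 24 inputs (21 sensors plus 3 settings), and the `model_predict` callable standing in for the trained XGBoost model. The project's real request schema may differ, for example if constant sensors are dropped before inference.

```python
N_FEATURES = 24  # assumption: 21 sensor readings + 3 operational settings


def health():
    """Body of a health-check response (hypothetical shape)."""
    return {"status": "ok"}


def validate_payload(payload):
    """Check the request body and return the features as floats."""
    features = payload.get("features")
    if not isinstance(features, list) or len(features) != N_FEATURES:
        raise ValueError(f"'features' must be a list of {N_FEATURES} numbers")
    for value in features:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("every feature must be a number")
    return [float(v) for v in features]


def predict(payload, model_predict):
    """Validate input, call the model, and attach a risk level.

    model_predict is a stand-in for the trained model: any callable that
    takes a list of floats and returns a predicted RUL in cycles.
    """
    features = validate_payload(payload)
    rul = float(model_predict(features))
    if rul <= 30:
        level = "Critical"
    elif rul <= 70:
        level = "Warning"
    else:
        level = "Healthy"
    return {"predicted_rul": rul, "risk_level": level}
```

In a FastAPI service, validation of this kind is normally declared through request and response models rather than written by hand, which is also what drives the generated Swagger UI documentation.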

Docker Containerization

  • Packaged as a Docker image, which supports quick local deployment, elastic scaling in cloud environments, and dependency isolation

AWS Cloud Deployment

  • Deployed on AWS EC2; the architecture consists of a Dockerized XGBoost service, a FastAPI backend, and an interactive web console

Continuous Integration

  • GitHub Actions CI pipeline: automatically runs dependency checks, syntax compilation, and pytest tests, among other steps

Section 06

Practical Application Value

For the Aviation Industry

  1. Cost Reduction: Shift from periodic maintenance to predictive maintenance, reducing unnecessary component replacements
  2. Safety Improvement: Identify potential faults in advance to avoid serious accidents
  3. Efficiency Optimization: Arrange maintenance plans reasonably to reduce aircraft downtime

For ML Practitioners

  1. Complete Reference: End-to-end process (data acquisition → cloud deployment)
  2. Best Practices: Data leakage prevention, feature selection, model persistence, API design
  3. Reusable Components: Training process, inference service, and Docker configuration can be migrated to other predictive maintenance scenarios

Section 07

Technical Insights and Key Takeaways

  1. Data Leakage Prevention: Split data by engine ID so that no engine contributes rows to both training and validation. The model is then evaluated only on engines it has never seen, which mirrors real deployment
  2. Target Engineering: Cap RUL at 125 cycles to allow the model to focus on the critical stage of maintenance decisions
  3. Business Translation: Map continuous prediction values to discrete risk levels to realize the transformation from technical output to business decisions

Section 08

Expansion Possibilities and Future Directions

This project framework can be extended in several directions:

  1. Multi-Condition Modeling: Use other C-MAPSS subsets (FD002-FD004) to cover different operating conditions and fault modes
  2. Deep Learning: Try time-series models such as LSTM and Transformer to capture more complex degradation patterns
  3. Uncertainty Quantification: Provide confidence intervals for predictions to help assess risks
  4. Real-Time Stream Processing: Integrate Kafka/Flink to run predictions on streaming sensor data in real time