ETA Prediction Engine: A City Travel Time Estimation Solution Integrating Neural Networks and LightGBM

This article analyzes an open-source project for New York taxi travel time prediction, exploring how to use ensemble learning of neural networks and gradient boosting models to mine travel patterns from spatiotemporal data and achieve accurate arrival time estimation.

Tags: ETA prediction, neural networks, LightGBM, ensemble learning, spatiotemporal data, travel time estimation, machine learning
Published 2026-05-16 05:50 · Recent activity 2026-05-16 06:06 · Estimated read 11 min
Section 01

ETA Prediction Engine: A City Travel Time Estimation Solution Integrating Neural Networks and LightGBM (Introduction)

This article analyzes an open-source project for New York taxi travel time prediction, exploring how to use ensemble learning of neural networks and LightGBM to mine travel patterns from spatiotemporal data and achieve accurate arrival time estimation. The solution combines the complementary strengths of the two model families to address the ETA prediction challenges posed by the dynamism and complexity of urban traffic, improving both the travel service experience and operational efficiency.


Section 02

Business Value, Technical Challenges, and Dataset Background of ETA Prediction

Business Value

Accurate ETA prediction matters to passengers (reducing waiting anxiety), drivers (better order dispatch), and platforms (intelligent scheduling, dynamic pricing); each one-minute reduction in prediction error can lower the user cancellation rate.

Technical Challenges

  • Spatiotemporal heterogeneity: The travel time for the same distance varies greatly in different time periods/regions;
  • Intertwined multi-source factors: Traffic, weather, road types, etc., are difficult to quantify;
  • Data sparsity: There is little historical data in some regions/time periods;
  • Real-time requirements: predictions must be served quickly, ruling out computationally heavy models at inference time.

Dataset Background

The New York taxi dataset contains millions of trip records (pick-up/drop-off time, location, number of passengers, etc.), which is large-scale and real, but has quality issues such as abnormal coordinates and incorrect timestamps that need to be handled.


Section 03

Ensemble Strategy of Neural Networks and LightGBM

This project adopts an ensemble of neural networks and LightGBM, leveraging their complementary strengths:

  • Neural Networks: Good at automatically learning feature representations (e.g., spatial embedding, time periodicity) and fusing heterogeneous inputs;
  • LightGBM: Excellent performance in tabular data tasks, robust to outliers, fast training, and supports missing value handling;
  • Ensemble Value: Reduces variance, mitigates overfitting, and improves overall performance. Common strategies include simple averaging, weighted averaging, stacking, or blending. This project may use feature-level fusion (NN embeddings as LightGBM inputs) or model-level fusion (combining prediction outputs).
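The article does not publish the fusion code, but the model-level fusion it describes can be sketched as a weighted average of the two models' predictions, with the weight chosen on a validation set. The names `nn_preds`/`gbm_preds` and the 0.1-step weight grid are illustrative assumptions:

```python
def blend(nn_preds, gbm_preds, w):
    """Model-level fusion: weighted average of two prediction lists."""
    return [w * a + (1 - w) * b for a, b in zip(nn_preds, gbm_preds)]

def pick_weight(nn_val, gbm_val, y_val, steps=11):
    """Choose the blend weight that minimizes validation MAE."""
    best_w, best_mae = 0.0, float("inf")
    for i in range(steps):
        w = i / (steps - 1)
        preds = blend(nn_val, gbm_val, w)
        mae = sum(abs(p - y) for p, y in zip(preds, y_val)) / len(y_val)
        if mae < best_mae:
            best_w, best_mae = w, mae
    return best_w
```

Stacking would replace `pick_weight` with a small meta-model trained on the base models' out-of-fold predictions; the grid search above is the simplest version of the same idea.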

Section 04

Spatiotemporal Feature Engineering: Modeling Spatial Relationships and Temporal Patterns

Spatial Feature Engineering

  • Geocoding and zoning: Mapping coordinates to fixed grids, administrative boundaries, or clustered regions;
  • Spatial embedding learning: Using Word2Vec-like techniques to map region IDs to low-dimensional vectors to capture spatial semantics;
  • Distance and direction: Euclidean/Manhattan/road network distance, direction features (e.g., towards the city center);
  • Spatiotemporal interaction: Constructing origin-destination pair features or using attention mechanisms to learn relative relationships.
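As a concrete sketch of the distance and direction features above (assuming raw latitude/longitude inputs; road-network distance would need map data and is omitted):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle ('straight-line') distance between two coordinates, in km."""
    r = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def manhattan_km(lat1, lon1, lat2, lon2):
    """Grid-style approximation: north-south leg plus east-west leg."""
    return (haversine_km(lat1, lon1, lat2, lon1)
            + haversine_km(lat2, lon1, lat2, lon2))

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from origin to destination (direction feature)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360
```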

Temporal Feature Modeling

  • Time decomposition: Granularities such as hour, day of the week, whether it is a weekend/holiday;
  • Periodic encoding: Using sine/cosine encoding to handle time periodicity (e.g., the relationship between 23:00 and 01:00);
  • Historical/real-time traffic: Historical average speed of the same time period/road segment, real-time traffic conditions (if data permits).
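The periodic encoding mentioned above is a standard trick and can be sketched in a few lines: map each periodic value onto a circle so that hours near midnight end up close together.

```python
import math

def cyclical(value, period):
    """Encode a periodic value as (sin, cos) so 23:00 and 01:00 end up close."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

# Hour-of-day features: 23:00 and 01:00 sit near each other on the circle,
# while a raw integer encoding would put them 22 apart.
h23 = cyclical(23, 24)
h01 = cyclical(1, 24)
```

The same encoding applies to day of week (`period=7`) or day of year (`period=365`).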

Section 05

Model Architecture and Training Strategy

Neural Network Part

Multi-input architecture to process spatial (coordinates/regions), temporal (decomposed features), and contextual (number of passengers, weather) information; spatial/temporal features are converted into dense vectors via embedding layers, then concatenated and input into fully connected layers (3-5 layers, ReLU activation + Batch Normalization).
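The exact architecture is not published, so the data flow above (embedding lookup → concatenation → fully connected layers) can only be sketched. The sizes, the single ReLU layer standing in for the 3-5 layer MLP, and the plain-list tensors are all simplifying assumptions:

```python
import random

random.seed(0)

EMB_DIM = 4
N_REGIONS, N_HOURS = 100, 24

# Embedding tables: one trainable dense vector per categorical id.
region_emb = [[random.gauss(0, 0.1) for _ in range(EMB_DIM)] for _ in range(N_REGIONS)]
hour_emb = [[random.gauss(0, 0.1) for _ in range(EMB_DIM)] for _ in range(N_HOURS)]

def forward(pickup_region, hour, dense_feats, weights, bias):
    """One forward step: look up embeddings, concatenate with dense features
    (e.g., distance, passenger count), apply a ReLU layer."""
    x = region_emb[pickup_region] + hour_emb[hour] + list(dense_feats)
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]
```

A real implementation would use a deep learning framework's embedding layers and add Batch Normalization between the dense layers, as the text describes.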

LightGBM Part

Trained using the same feature set (or NN embeddings); hyperparameter tuning (learning rate, tree depth, sampling strategy), often using Optuna/Grid Search for automatic tuning.
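The article does not publish the tuned hyperparameters, so the following configuration is only a typical starting point that a tuner such as Optuna or grid search would refine; the parameter names are real LightGBM options, the values and search ranges are assumptions:

```python
# Illustrative LightGBM configuration for a travel-time regression target.
lgbm_params = {
    "objective": "regression",
    "metric": "mae",
    "learning_rate": 0.05,
    "num_leaves": 63,         # tree complexity
    "max_depth": -1,          # no hard depth limit; num_leaves controls size
    "feature_fraction": 0.8,  # column sampling per tree
    "bagging_fraction": 0.8,  # row sampling
    "bagging_freq": 1,
    "min_data_in_leaf": 100,  # guards against overfitting sparse region/hour cells
}

# A matching search space for automatic tuning (ranges are assumptions):
search_space = {
    "learning_rate": (0.01, 0.1),
    "num_leaves": (31, 255),
    "min_data_in_leaf": (20, 500),
}
```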

Ensembling and Training Strategy

  • Ensembling: simple averaging or stacking (a meta-model combines the base models' outputs);
  • Loss function: RMSE, MAE, or a custom loss (e.g., weighting overestimation and underestimation differently);
  • Training techniques: cross-validation, early stopping, learning rate scheduling, and handling the skewed trip-duration distribution.
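The custom loss mentioned above can be sketched as an asymmetric MAE; the specific weighting is not given in the article, so the factor of 2.0 on underestimation is an illustrative assumption:

```python
def asymmetric_mae(y_true, y_pred, under_weight=2.0):
    """MAE variant that penalizes underestimation more than overestimation:
    telling a rider 10 minutes when the trip takes 20 hurts more than the
    reverse. The 2.0 weight is an illustrative assumption."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        err = y - p  # positive when the model underestimated
        total += under_weight * err if err > 0 else -err
    return total / len(y_true)
```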

Section 06

Data Preprocessing and Model Evaluation

Data Preprocessing

  • Outlier handling: Delete/truncate abnormal coordinate, time, and speed records;
  • Missing value handling: Delete a small number of missing values or fill with mode/unknown category;
  • Feature scaling: Neural networks require standardization/normalization;
  • Prevent data leakage: Split training/test sets by time (avoid random splitting).
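The outlier filtering and leakage-safe split above can be sketched as follows; the NYC bounding box, the 1 s-6 h duration window, and the record field names are illustrative assumptions:

```python
from datetime import datetime

def clean_trips(trips):
    """Drop records with impossible coordinates or durations."""
    kept = []
    for t in trips:
        ok_coords = (40.5 <= t["pickup_lat"] <= 41.0
                     and -74.3 <= t["pickup_lon"] <= -73.6)
        ok_duration = 1 <= t["duration_s"] <= 6 * 3600
        if ok_coords and ok_duration:
            kept.append(t)
    return kept

def time_split(trips, cutoff):
    """Split by pickup time rather than randomly, so the test set never
    leaks future traffic patterns into training."""
    train = [t for t in trips if t["pickup_time"] < cutoff]
    test = [t for t in trips if t["pickup_time"] >= cutoff]
    return train, test
```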

Evaluation Metrics

  • Technical metrics: RMSE (penalizes large errors), MAE (robust), MAPE (cross-scenario comparison), R² (explained variance);
  • Business metrics: Proportion of errors ≤5 minutes, overestimation/underestimation distribution, frequency of extreme errors;
  • Segmented evaluation: Evaluate model performance by time period, region, and distance separately.
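The technical and business metrics above can be computed together in one pass; times are assumed to be in minutes, and the 5-minute tolerance comes from the business metric in the text:

```python
import math

def eta_metrics(y_true, y_pred, tol_min=5.0):
    """RMSE/MAE/MAPE plus the business metric from the text:
    share of predictions within `tol_min` minutes of the actual time."""
    n = len(y_true)
    errs = [p - y for y, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errs) / n)
    mae = sum(abs(e) for e in errs) / n
    mape = 100 * sum(abs(e) / y for e, y in zip(errs, y_true)) / n
    within_tol = sum(abs(e) <= tol_min for e in errs) / n
    return {"rmse": rmse, "mae": mae, "mape_pct": mape, "within_tol": within_tol}
```

Running the same function on per-period, per-region, or per-distance slices gives the segmented evaluation described above.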

Section 07

Key Considerations for Deployment and Online Services

Inference Latency Optimization

Fast response is required (<100ms); precompute features and use model serving frameworks (TensorFlow Serving, Triton).
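Feature precomputation is the cheapest of these optimizations: aggregate per-(region, hour) historical statistics offline so the online path is a dictionary lookup rather than a scan over trip history. The keys, field names, and the 20 km/h citywide default are illustrative assumptions:

```python
from collections import defaultdict

def build_speed_table(trips):
    """Offline: average historical speed per (region, hour) cell."""
    sums = defaultdict(lambda: [0.0, 0])
    for t in trips:
        key = (t["region"], t["hour"])
        sums[key][0] += t["speed_kmh"]
        sums[key][1] += 1
    return {k: s / c for k, (s, c) in sums.items()}

def lookup_speed(table, region, hour, default_kmh=20.0):
    """Online: O(1) lookup; fall back to a citywide default for unseen cells."""
    return table.get((region, hour), default_kmh)
```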

Model Update

Regular retraining (traffic patterns drift over time), with an automated pipeline: data collection → feature computation → training → A/B testing → canary (gray) release; monitor for performance degradation to trigger retraining.

Interpretability

  • Feature importance analysis;
  • SHAP value decomposition of individual prediction contributions;
  • Partial dependence plots to show the relationship between features and predictions.

Cold Start Problem

When there is a lack of data for new regions/drivers, use rule fallback or transfer learning (learn from similar regions/drivers).
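The rule fallback can be sketched as a threshold check: below a minimum amount of history, estimate ETA from distance and an assumed citywide average speed instead of trusting the model's extrapolation. The 50-trip threshold and 18 km/h fallback speed are illustrative assumptions:

```python
MIN_TRIPS = 50  # below this, the learned model isn't trusted (assumption)

def predict_eta(region_trip_count, model_pred, distance_km, fallback_kmh=18.0):
    """Return the model's ETA (minutes) for well-covered regions,
    or a distance / average-speed heuristic for cold-start regions."""
    if region_trip_count >= MIN_TRIPS:
        return model_pred
    return 60.0 * distance_km / fallback_kmh
```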


Section 08

Project Summary and Practical Insights

This project demonstrates the effectiveness of combining deep learning and traditional ML to solve practical business problems: neural networks automatically learn spatial embeddings, LightGBM makes efficient use of structured features, and ensembling improves overall performance. For teams building similar systems, key insights include:

  1. Deeply understand the business scenario;
  2. Carefully design spatiotemporal features;
  3. Attach importance to data quality;
  4. Choose an appropriate ensemble strategy;
  5. Establish a continuous model iteration and operations process.

With the enrichment of traffic data and advances in algorithms, ETA prediction accuracy will continue to improve, laying the foundation for intelligent travel services.