# UFC Fight Prediction System: How to Surpass Academic Papers in Combat Sports Prediction Using Machine Learning

> A UFC fight prediction system built on rolling feature engineering and a five-model ensemble achieves 68.45% accuracy on unseen data from 2023 to 2026, surpassing the best result published at ACM ICIIP 2024.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-05-14T19:56:03.000Z
- Last activity: 2026-05-14T19:59:00.923Z
- Popularity: 154.9
- Keywords: UFC, machine learning, fight prediction, temporal feature engineering, model ensembling, sports prediction, XGBoost, CatBoost, rolling features, data leakage
- Page link: https://www.zingnex.cn/en/forum/thread/ufc
- Canonical: https://www.zingnex.cn/forum/thread/ufc

---

## UFC Fight Prediction System: Core Achievements of Surpassing Academic Papers with Machine Learning

This article introduces an open-source UFC fight prediction system. Through strict temporal feature engineering and a five-model ensemble strategy, it achieves a prediction accuracy of 68.45% on unseen data from 2023 to 2026, surpassing the best academic result (66.71%) published in ACM ICIIP 2024. This system addresses the data leakage issue in sports prediction and provides a prediction solution applicable in real-world scenarios.

## Project Background and Core Challenges

UFC prediction faces two core challenges:
1. **Data Leakage**: Existing models often use information from the future, so they backtest well but fail in live use. This project enforces a strict temporal split (training on fights before 2023, testing on fights from 2023 to 2026) to eliminate leakage.
2. **Changing Fighter Form**: Career averages cannot reflect how a fighter's skills and physical condition change over time; rolling features are needed to capture these dynamics.
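The temporal split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the record layout (a dict with a `date` key), the function name `temporal_split`, and the sample fights are all assumptions made for the example.

```python
from datetime import date


def temporal_split(fights, cutoff):
    """Split fight records into train (before cutoff) and test (on or after cutoff)."""
    train = [f for f in fights if f["date"] < cutoff]
    test = [f for f in fights if f["date"] >= cutoff]
    # Sanity check: the newest training fight must predate the oldest test fight.
    if train and test:
        assert max(f["date"] for f in train) < min(f["date"] for f in test)
    return train, test


# Hypothetical records, for illustration only.
fights = [
    {"date": date(2021, 6, 12), "winner": "A"},
    {"date": date(2022, 12, 10), "winner": "B"},
    {"date": date(2023, 1, 21), "winner": "A"},
    {"date": date(2024, 4, 13), "winner": "B"},
]

train, test = temporal_split(fights, cutoff=date(2023, 1, 1))
print(len(train), len(test))
```

Splitting on a date rather than shuffling rows means no fight in the test set can influence anything the model saw during training.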

## Core Methods: Model Ensemble and Feature Engineering

### Five-Model Ensemble Architecture
The system uses the average of prediction probabilities from five models:
- XGBoost (500 trees, depth 3, learning rate 0.01)
- LightGBM (500 trees, depth 3, learning rate 0.01)
- Random Forest (500 trees, depth 6)
- Logistic Regression (standardized scaling)
- CatBoost (500 rounds, depth 3, learning rate 0.01)
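Averaging the five models' probabilities amounts to equal-weight soft voting. The sketch below shows only the averaging step and assumes each model has already produced a win probability for fighter A; the function names, the 0.5 decision threshold, and the sample probabilities are hypothetical.

```python
from statistics import fmean


def ensemble_probability(model_probs):
    """Average per-model win probabilities for one fight (equal-weight soft voting)."""
    if not model_probs:
        raise ValueError("need at least one model probability")
    return fmean(model_probs.values())


def predict_winner(model_probs, threshold=0.5):
    """Return the predicted side and the averaged probability for fighter A."""
    p = ensemble_probability(model_probs)
    return ("fighter_a" if p >= threshold else "fighter_b"), p


# Hypothetical outputs of the five models for a single fight.
probs = {
    "xgboost": 0.62,
    "lightgbm": 0.58,
    "random_forest": 0.55,
    "logistic_regression": 0.60,
    "catboost": 0.65,
}

pick, p = predict_winner(probs)
print(pick, round(p, 4))
```

Averaging probabilities instead of counting votes keeps each model's confidence in the final number, which the confidence tiers later rely on.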

### Rolling Feature Engineering
Features are calculated using only historical data before the match date, covering:
- Career performance (streak difference, total win difference, etc.)
- Physical attributes (height difference, reach difference, etc.)
- Offensive efficiency (strikes per minute difference, etc.)
- Strike distribution (head/body strike ratio difference, etc.)
- State decay (strike accuracy change rate, etc.)
- Finishing/defensive ability (finishing rate difference, takedown defense success rate difference, etc.)
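One way to guarantee that a feature row sees only earlier fights is to walk the fights in date order and update each fighter's running totals after the row has been recorded. The sketch below illustrates that idea for two of the listed features; the schema (`a`, `b`, `winner`, `a_strikes`, `b_strikes`, `minutes`) and the feature names are assumptions, not the project's real columns.

```python
from collections import defaultdict
from datetime import date


def strikes_per_minute(h):
    """Strikes landed per minute so far; 0.0 for a fighter with no recorded history."""
    return h["strikes"] / h["minutes"] if h["minutes"] else 0.0


def rolling_features(fights):
    """For each fight, emit stat differences computed from strictly earlier fights."""
    history = defaultdict(lambda: {"wins": 0, "strikes": 0, "minutes": 0.0})
    rows = []
    for f in sorted(fights, key=lambda f: f["date"]):
        ha, hb = history[f["a"]], history[f["b"]]
        rows.append({
            "date": f["date"],
            "win_diff": ha["wins"] - hb["wins"],
            "spm_diff": strikes_per_minute(ha) - strikes_per_minute(hb),
        })
        # Update the running totals only after the feature row is stored,
        # so the current fight's outcome never leaks into its own features.
        for name, strikes in ((f["a"], f["a_strikes"]), (f["b"], f["b_strikes"])):
            h = history[name]
            h["wins"] += int(f["winner"] == name)
            h["strikes"] += strikes
            h["minutes"] += f["minutes"]
    return rows


# Hypothetical fights, for illustration only.
fights = [
    {"date": date(2022, 1, 1), "a": "X", "b": "Y", "winner": "X",
     "a_strikes": 60, "b_strikes": 30, "minutes": 15.0},
    {"date": date(2022, 6, 1), "a": "X", "b": "Z", "winner": "Z",
     "a_strikes": 20, "b_strikes": 40, "minutes": 5.0},
    {"date": date(2023, 1, 1), "a": "Y", "b": "X", "winner": "Y",
     "a_strikes": 50, "b_strikes": 45, "minutes": 15.0},
]

rows = rolling_features(fights)
for row in rows:
    print(row)
```

The sketch assumes a fighter appears at most once per date; a same-day tournament format would need an explicit bout order as a tiebreaker.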

### Other Innovative Methods
- **Style Collision Quantification**: Features such as position/target style distance and wrestling advantage describe how one style counters another
- **Market Information Integration**: Rankings and the implied probability difference derived from odds provide incremental value; ELO ratings, by contrast, performed poorly
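The implied probability difference can be derived from bookmaker odds roughly as shown below. The post does not state the odds format or whether the bookmaker margin is removed, so this sketch assumes decimal odds and simple proportional normalization; the function name and the sample odds are hypothetical.

```python
def implied_probabilities(odds_a, odds_b):
    """Convert decimal odds to implied win probabilities with the margin normalized out."""
    if odds_a <= 1.0 or odds_b <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    raw_a, raw_b = 1.0 / odds_a, 1.0 / odds_b
    total = raw_a + raw_b  # exceeds 1.0 by the bookmaker margin (overround)
    return raw_a / total, raw_b / total


# Hypothetical odds: fighter A is the favorite.
p_a, p_b = implied_probabilities(1.50, 2.75)
implied_prob_diff = p_a - p_b
print(round(p_a, 4), round(p_b, 4), round(implied_prob_diff, 4))
```

Normalizing makes the two probabilities sum to 1, so the difference feature is comparable across bookmakers with different margins.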

## Experimental Evidence and Academic Comparison

### Confidence Stratification Performance
| Tier | Confidence Threshold | Historical Accuracy | Backtest Return Rate |
|------|----------------------|---------------------|----------------------|
| High Confidence | 80%+ | 89.9% | +3.3% |
| Medium-High Confidence | 75%+ | 86.6% | +4.2% |
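A tier report like the table above can be computed along these lines. The post does not say how the return rate is calculated, so the sketch assumes a flat one-unit stake per pick at decimal odds and treats each threshold as cumulative (the 75%+ tier includes the 80%+ picks). The sample predictions are invented and do not reproduce the project's figures.

```python
def tier_report(predictions, threshold):
    """Accuracy and flat-stake return for picks whose confidence meets the threshold.

    Each prediction is a tuple: (confidence, correct, decimal_odds).
    """
    picks = [p for p in predictions if p[0] >= threshold]
    if not picks:
        return {"n": 0, "accuracy": None, "roi": None}
    wins = sum(1 for _, correct, _ in picks if correct)
    # A winning one-unit bet returns (odds - 1) profit; a losing bet costs the stake.
    profit = sum((odds - 1.0) if correct else -1.0 for _, correct, odds in picks)
    return {"n": len(picks), "accuracy": wins / len(picks), "roi": profit / len(picks)}


# Hypothetical picks, for illustration only.
predictions = [
    (0.85, True, 1.20),
    (0.82, True, 1.25),
    (0.78, False, 1.40),
    (0.76, True, 1.50),
    (0.60, False, 1.90),
]

high = tier_report(predictions, 0.80)
medium_high = tier_report(predictions, 0.75)
print(high)
print(medium_high)
```

High-confidence picks tend to be heavy favorites with short odds, so a tier can be very accurate and still return little. That is one possible reading of why the table's 80%+ tier shows a lower return than the 75%+ tier.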

### Temporal Leakage Verification
An extreme time-gap test (training on data before 2020, testing on data after 2024) still reaches 65.91% accuracy, which supports the claim that the pipeline is free of leakage. The version with leakage (v9) reports 63.04% accuracy but fails in practice.

### Feature Ablation Experiment
- Ineffective attempts: sliding window (-0.61pp), exponential decay (-0.73pp), and separate training per weight class (+0.06pp, a negligible gain)
- Effective additions: defensive features (+0.12pp) and reversal features (+0.12pp)

### Academic Comparison
| Study | Accuracy | Method | Limitation |
|-------|----------|--------|------------|
| Yan et al. (ICIIP 2024) | 66.71% | GBDT | No temporal segmentation |
| This Project | 68.45% | Five-model Ensemble | Strictly no leakage |

## Interactive Application Features

The project provides a Streamlit web application:
1. **Upcoming Event Prediction**: Automatically loads the next UFC event, fetches real-time odds, and sorts predictions by confidence
2. **Custom Match Prediction**: Select any two fighters from a library of 2241 fighters, optionally enter odds for comparison, and view a side-by-side statistics table

Fetching real-time odds requires a key for The Odds API (the free tier allows 500 requests per month); predictions still work without a key.

## Implications for Sports Prediction Practice

1. **Temporal leakage is a core trap**: Strict temporal segmentation and rolling features are the foundation of real performance
2. **Feature engineering takes priority over model complexity**: The improvement from a single XGBoost model (66.02%) to the ensemble (68.45%) comes from feature refinement
3. **Domain knowledge creates differences**: Features like style collision and state decay rely on understanding combat sports
4. **Confidence stratification is key**: Ignoring reliability stratification leads to failure in high-risk scenarios

## Conclusion: System Value and Practical Significance

This system demonstrates the potential of machine learning in sports prediction. The core strategies (temporal features, multi-dimensional features, model ensemble, confidence stratification) form a usable framework. It provides practitioners with verified technical paths and lessons learned, emphasizing reproducibility and practicality in real-world scenarios.
