Zing Forum

OpenBet: A Football Prediction Engine Combining Statistical Modeling, Machine Learning, and Claude AI

OpenBet is a professional-level football match prediction system that integrates Poisson statistical models, XGBoost classifiers, real-time betting odds, and a large language model inference layer to identify high-value betting opportunities in top European leagues.

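Tags: football prediction, machine learning, Claude AI, Poisson model, XGBoost, sports betting, large language models, automated pipeline, prediction system, quantitative analysis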
Published 2026-05-24 00:02 | Recent activity 2026-05-24 00:50 | Estimated read 8 min

Section 01

OpenBet Project Overview

OpenBet is a professional-level football match prediction system that integrates Poisson statistical models, XGBoost classifiers, real-time betting odds, and the reasoning capabilities of large language models (such as Claude Sonnet). It aims to identify high-value betting opportunities in top European leagues. The system covers the entire workflow, from data acquisition and feature engineering through model training to automated deployment. Its core design philosophy is "value first": it focuses on the gap between the model's predicted probability and the market's implied probability, and applies ideas from quantitative finance to sports betting decisions.

Section 02

Project Background and Design Philosophy

Original Author and Source

  • Original Author/Maintainer: devon1910
  • Source Platform: GitHub
  • Original Title: OpenBet — Football Prediction Engine
  • Original Link: https://github.com/devon1910/OpenBet
  • Release/Update Time: 2026-05-23T16:02:55Z

Design Philosophy

OpenBet is centered on the "value first" principle. Rather than only predicting match outcomes, it looks for "value bets", where the model's predicted probability differs significantly from the probability implied by the market odds. It treats sports betting as a decision-making problem that can be modeled, calculated, and optimized.
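
The "value first" idea can be sketched in a few lines. This is a minimal illustration, assuming decimal odds and a simple additive edge; the function names and the 5% minimum edge are hypothetical and are not taken from the OpenBet repository.

```python
def implied_probability(decimal_odds: float) -> float:
    """Market-implied probability from decimal odds (bookmaker margin not removed)."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1.0")
    return 1.0 / decimal_odds


def value_edge(model_prob: float, decimal_odds: float) -> float:
    """Model probability minus market-implied probability."""
    return model_prob - implied_probability(decimal_odds)


def expected_value(model_prob: float, decimal_odds: float) -> float:
    """Expected profit per unit staked, if the model probability is correct."""
    return model_prob * decimal_odds - 1.0


def is_value_bet(model_prob: float, decimal_odds: float, min_edge: float = 0.05) -> bool:
    """min_edge is an assumed threshold, chosen only for illustration."""
    return value_edge(model_prob, decimal_odds) >= min_edge
```

A bet is only interesting under this view when the edge is positive; a high win probability at short odds can still have negative expected value.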

Section 03

Technical Architecture and Core Methods

OpenBet uses a four-layer architecture in which each layer builds on the previous ones to improve prediction accuracy and reliability:

  1. Poisson Statistical Model: An extension of the Dixon-Coles model that estimates each team's attacking and defensive strength and home advantage, and accounts for changes in recent form;
  2. XGBoost Classifier: Combines more than 20 features, such as ELO ratings, xG (expected goals), and head-to-head history, to capture non-linear feature interactions;
  3. Meta-Learner: Stacks the outputs of the preceding layers with logistic regression, reducing the bias and variance of any single model;
  4. LLM Inference Layer: Calls Gemini 2.5 Flash (with Claude Sonnet as backup) to take in textual information such as news and injury reports, and adjusts prediction confidence within a ±0.10 range.
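
To make the first and last layers more concrete, here is a rough sketch. It assumes independent Poisson goal counts (so it omits the Dixon-Coles low-score correction) and treats the LLM output as a signed adjustment that is clamped to ±0.10. All function names and the expected-goals inputs are hypothetical; the project's actual code may differ.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals when the expected goal count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(home_rate: float, away_rate: float, max_goals: int = 10):
    """Return (home win, draw, away win) from a grid of scoreline probabilities."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    total = home + draw + away  # renormalise because the grid is truncated
    return home / total, draw / total, away / total


def apply_llm_adjustment(prob: float, llm_delta: float, max_shift: float = 0.10) -> float:
    """Clamp the LLM's suggested shift to ±max_shift and keep the result in [0, 1]."""
    delta = max(-max_shift, min(max_shift, llm_delta))
    return max(0.0, min(1.0, prob + delta))
```

Clamping the adjustment keeps the language model in an advisory role: it can nudge a statistical estimate but cannot override it.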

Section 04

Automation Engineering Implementation Details

Self-Healing Data Pipeline

Every 6 hours the pipeline syncs match results, rebuilds features, and fetches odds. Built-in health checks include schema drift detection, rebuilding of stale features, and automatic retraining.
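
Two of these health checks are simple enough to sketch. The example below is only an illustration: the expected column set and the helper names are assumptions, and the 6-hour staleness limit is borrowed from the sync interval rather than confirmed by the project.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical expected schema for an incoming match record.
EXPECTED_COLUMNS = {"match_id": int, "home_team": str, "away_team": str, "kickoff": str}


def detect_schema_drift(record: dict, expected: dict = EXPECTED_COLUMNS) -> list[str]:
    """Return a list of human-readable problems; an empty list means no drift."""
    problems = []
    for name, expected_type in expected.items():
        if name not in record:
            problems.append(f"missing column: {name}")
        elif not isinstance(record[name], expected_type):
            problems.append(f"type changed: {name}")
    for name in record.keys() - expected.keys():
        problems.append(f"unexpected column: {name}")
    return problems


def is_stale(last_built: datetime, now: datetime,
             max_age: timedelta = timedelta(hours=6)) -> bool:
    """A feature set is stale once it is older than max_age."""
    return now - last_built > max_age
```

A scheduler job could run both checks at the start of each sync and trigger a feature rebuild or retraining only when one of them reports a problem.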

Intelligent Caching and Invalidation Strategy

Predictions are generated ahead of time and cached, so reads return in milliseconds. An odds update automatically triggers regeneration, which balances freshness against API call costs.
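
One way to implement this pattern is to store a fingerprint of the odds next to each cached prediction and recompute only when the fingerprint changes. The class below is a sketch under that assumption; `PredictionCache` and the `compute` callback are hypothetical names, and an in-memory dict stands in for whatever store the project really uses.

```python
import hashlib
import json


class PredictionCache:
    def __init__(self, compute):
        self._compute = compute  # hypothetical, expensive prediction function
        self._entries = {}       # match_id -> (odds fingerprint, prediction)

    @staticmethod
    def _fingerprint(odds: dict) -> str:
        """Stable hash of the odds, independent of key order."""
        return hashlib.sha256(json.dumps(odds, sort_keys=True).encode()).hexdigest()

    def refresh(self, match_id, odds: dict) -> bool:
        """Write path: recompute only if the odds changed. Returns True if recomputed."""
        fingerprint = self._fingerprint(odds)
        cached = self._entries.get(match_id)
        if cached is not None and cached[0] == fingerprint:
            return False
        self._entries[match_id] = (fingerprint, self._compute(match_id, odds))
        return True

    def get(self, match_id):
        """Read path: a plain lookup, no model or API calls."""
        cached = self._entries.get(match_id)
        return None if cached is None else cached[1]
```

Keeping all computation on the write path means the read path never waits on a model or an external API.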

Progressive Threshold Relaxation

Selection thresholds are relaxed step by step across four levels, from strict to lenient, with at most 10 recommendations per day, to balance quality and coverage.
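
The source does not give the threshold values or the exact selection rule, so the following is only one plausible reading: candidates are admitted level by level, strictest first, until the daily cap is reached. The `LEVELS` values and the `edge` and `confidence` field names are illustrative assumptions.

```python
# Assumed (min_edge, min_confidence) pairs, ordered from strict to relaxed.
LEVELS = [(0.10, 0.70), (0.07, 0.65), (0.05, 0.60), (0.03, 0.55)]


def select_recommendations(candidates, levels=LEVELS, max_picks=10):
    """Pick candidates level by level, never exceeding max_picks."""
    picked = []
    # Within a level, prefer the candidates with the largest edge.
    remaining = sorted(candidates, key=lambda c: c["edge"], reverse=True)
    for min_edge, min_confidence in levels:
        not_yet_picked = []
        for candidate in remaining:
            qualifies = (candidate["edge"] >= min_edge
                         and candidate["confidence"] >= min_confidence)
            if qualifies and len(picked) < max_picks:
                picked.append(candidate)
            else:
                not_yet_picked.append(candidate)
        remaining = not_yet_picked
        if len(picked) >= max_picks:
            break
    return picked
```

On a day with many strong candidates the relaxed levels are never reached; on a quiet day they let some weaker picks through instead of returning nothing.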

Section 05

Data Sources and Tech Stack Selection

Data Source Integration

  • football-data.org: Match/team/score data (free version, 10 requests per minute)
  • odds-api.io: Odds and league coverage (100 requests per hour)
  • api-football: xG and injury information (100 requests per day)
  • Gemini API: Contextual reasoning (free quota)
  • Anthropic API: Backup reasoning (paid)
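
Quotas this tight usually call for client-side throttling. The sliding-window limiter below is a sketch of one way to respect them; the class name and the `LIMITS` mapping are hypothetical, and only the numeric limits come from the list above.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any window of period_seconds."""

    def __init__(self, max_calls: int, period_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period_seconds
        self._clock = clock    # injectable so the limiter can be tested without sleeping
        self._calls = deque()  # timestamps of recent calls, oldest first

    def try_acquire(self) -> bool:
        """Record a call and return True if the quota allows it, else return False."""
        now = self._clock()
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False


# One limiter per data source, using the quotas listed above.
LIMITS = {
    "football-data.org": SlidingWindowLimiter(10, 60),
    "odds-api.io": SlidingWindowLimiter(100, 3600),
    "api-football": SlidingWindowLimiter(100, 86400),
}
```

A caller would check `LIMITS[source].try_acquire()` before each request and defer the request when it returns False.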

Tech Stack Selection

Layer | Tech Selection
API Framework | FastAPI (Python 3.11)
Database | PostgreSQL + asyncpg + SQLAlchemy 2.0
Machine Learning | XGBoost, scikit-learn, SciPy
AI Inference | Gemini 2.5 Flash / Claude Sonnet 4
Task Scheduling | APScheduler
Authentication & Authorization | JWT + bcrypt
Frontend | Tailwind CDN + Vanilla JS

Section 06

Performance Metrics and Backtest Results

According to the project documentation, OpenBet's backtest performance is as follows:

  • Accuracy: 81.5%
  • Return on Investment (ROI): 17%
  • Perfect Prediction Days: 20 out of 42 match days

Note: Past performance does not guarantee future returns; continuous monitoring and iterative optimization are required.
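
For readers unfamiliar with these metrics, the sketch below shows one common way to compute them from a list of settled bets. It assumes accuracy means the share of winning bets and ROI means net profit divided by total stake; the project documentation may define them differently, and `backtest_metrics` is a hypothetical name.

```python
def backtest_metrics(bets):
    """bets: iterable of (stake, decimal_odds, won) tuples for settled bets."""
    bets = list(bets)
    if not bets:
        raise ValueError("no bets to evaluate")
    staked = sum(stake for stake, _, _ in bets)
    # A winning bet returns stake * (odds - 1) in profit; a losing bet loses the stake.
    profit = sum(stake * (odds - 1.0) if won else -stake for stake, odds, won in bets)
    accuracy = sum(1 for _, _, won in bets if won) / len(bets)
    return {"accuracy": accuracy, "roi": profit / staked}
```

The two numbers can diverge: a strategy that backs short-priced favourites can be accurate and still lose money, which is why ROI is reported alongside accuracy.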

Section 07

Application Scenarios and Extensibility

The OpenBet architecture can be extended to multiple scenarios:

  1. Sports Event Analysis: Other sports such as basketball and tennis;
  2. Financial Market Prediction: Time series like stock prices and exchange rates;
  3. Risk Assessment: Credit scoring, insurance pricing;
  4. Decision Support: Decision scenarios combining structured data and text.

Section 08

Summary and Insights

OpenBet shows how traditional machine learning can be combined with LLMs to build an end-to-end prediction system, and it illustrates good practice across the whole workflow: problem definition, data engineering, model selection, and deployment.

For AI application developers, it is a useful case study. It shows that a complex prediction task can be turned into a production-level system with a sensible architecture, and it is a reminder that engineering concerns such as data quality, system monitoring, and cost control deserve as much attention as model tuning.