Section 01
Guide to the Machine Learning-Based MLB Game Prediction System
This project is a production-grade MLB game prediction system developed by Roman Esquibel. It integrates multiple data sources (real-time Statcast data, historical team and player performance, pitcher trends, and recent team momentum) into an end-to-end automated machine learning pipeline that generates daily win probability predictions for each game. The pipeline covers the entire process, from data scraping through feature engineering and model training to prediction output, and can be applied to sports betting decisions, team analysis, and education. Its design is modular, fully automated, and free of data leakage.
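To make the "no data leakage" property concrete, here is a minimal sketch of how a recent-momentum feature could be computed from prior games only. It is not taken from the project's code: the function name `add_prior_win_rate`, the record keys (`date`, `home`, `away`, `home_win`), and the 10-game window are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def add_prior_win_rate(games, window=10):
    """Attach each team's win rate over its previous `window` games.

    `games` is a list of dicts with hypothetical keys 'date' (ISO string),
    'home', 'away', and 'home_win' (bool). The features for a game are read
    BEFORE that game's result is recorded, so the outcome being predicted
    never contributes to its own features.
    """
    # Per-team history of recent results (1 = win, 0 = loss), capped at `window`.
    history = defaultdict(lambda: deque(maxlen=window))
    rows = []

    # Process games in chronological order; ISO dates sort correctly as strings.
    for game in sorted(games, key=lambda g: g["date"]):
        home_hist = history[game["home"]]
        away_hist = history[game["away"]]

        # Step 1: build features from past games only (None if no history yet).
        rows.append({
            **game,
            "home_prior_win_rate": sum(home_hist) / len(home_hist) if home_hist else None,
            "away_prior_win_rate": sum(away_hist) / len(away_hist) if away_hist else None,
        })

        # Step 2: only now record this game's result for use by later games.
        home_hist.append(1 if game["home_win"] else 0)
        away_hist.append(0 if game["home_win"] else 1)

    return rows


if __name__ == "__main__":
    sample = [
        {"date": "2024-04-01", "home": "NYY", "away": "BOS", "home_win": True},
        {"date": "2024-04-02", "home": "NYY", "away": "BOS", "home_win": False},
        {"date": "2024-04-03", "home": "BOS", "away": "NYY", "home_win": True},
    ]
    for row in add_prior_win_rate(sample):
        print(row["date"], row["home_prior_win_rate"], row["away_prior_win_rate"])
```

The key design choice is the ordering inside the loop: features are read first and the result is appended afterward. Reversing those two steps would let each game's outcome influence its own features, which is the kind of leakage the system is described as avoiding.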