Section 01
Introduction: MedRisk-Classifier—A Reproducible Chronic Disease Risk Prediction System Unifying Three Clinical Datasets
This article introduces MedRisk-Classifier, a production-grade machine learning pipeline that targets a persistent problem in medical AI: models that generalize poorly beyond the data they were trained on. A single workflow covering preprocessing, feature engineering, model training, and evaluation adapts to three independent clinical datasets (Diabetes-Large, Cleveland Heart Disease, and Pima Indian Diabetes) and is designed to deliver high-accuracy chronic disease risk prediction on each. Key features include a modular architecture, class imbalance handling, and multi-model comparison.
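To make the idea of one workflow shared across models concrete, here is a minimal sketch using scikit-learn. It is not the project's actual code: the function names `build_pipeline` and `compare_models` are hypothetical, the data is synthetic rather than one of the three clinical datasets, and `class_weight="balanced"` stands in for whatever class imbalance strategy MedRisk-Classifier really uses.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(model):
    """Wrap any classifier in the same preprocessing steps (hypothetical helper)."""
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fill missing values
        ("scale", StandardScaler()),                   # standardize features
        ("model", model),
    ])


def compare_models(X, y, seed=0):
    """Return the mean cross-validated ROC AUC for each candidate model."""
    # Assumption: class weighting is used here as one simple way to handle imbalance.
    candidates = {
        "logreg": LogisticRegression(class_weight="balanced", max_iter=1000),
        "forest": RandomForestClassifier(
            n_estimators=50, class_weight="balanced", random_state=seed
        ),
    }
    # Stratified folds keep the minority-class ratio stable across splits.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return {
        name: cross_val_score(
            build_pipeline(model), X, y, cv=cv, scoring="roc_auc"
        ).mean()
        for name, model in candidates.items()
    }


# Synthetic, imbalanced stand-in for a clinical table (roughly 80% negative cases).
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.8, 0.2], random_state=0
)
scores = compare_models(X, y)
for name, auc in scores.items():
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Because preprocessing lives inside the pipeline, it is refit on each training fold, so the held-out fold does not influence imputation or scaling. Pointing the same two functions at a different feature table is the sense in which the workflow is unified across datasets.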