Section 01
Introduction: Machine Learning-Based Comparative Study of Behavioral Risk Factors for Tobacco Use
This project is an end-to-end data science workflow that applies several machine learning algorithms to behavioral risk factor data on tobacco use and predicts the "high confidence limit" indicator, that is, the upper bound of the confidence interval reported for each estimate. The data comes from the Behavioral Risk Factor Surveillance System (BRFSS) and covers 2011 to the present. In a comparison of the algorithms, Random Forest and Support Vector Machine (SVM) performed best. The results are intended to support public health decision-making, medical research, and education.
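To make the comparison step concrete, here is a minimal sketch of how a Random Forest and an SVM could be scored side by side with scikit-learn. It makes several assumptions that are not taken from the project: the task is treated as regression on a numeric target, the data is synthetic rather than BRFSS, and the feature layout, hyperparameters, and the choice of 5-fold cross-validated R^2 as the metric are illustrative only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT BRFSS): 300 rows, 4 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
# Hypothetical target playing the role of the upper confidence limit.
y = 20.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=300)

# Candidate models; hyperparameters are illustrative, not the project's.
models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    # SVR is sensitive to feature scale, so standardize inside a pipeline.
    "svm": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
}

# Mean 5-fold cross-validated R^2 for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

Wrapping the SVM in a pipeline keeps the scaling step inside each cross-validation fold, so the held-out fold never influences the scaler. On real BRFSS data, categorical fields would also need encoding before either model could be fitted.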