# Application of Explainable Machine Learning in Children's Autism Screening: From Black Box to Transparent Medical AI

> An Autism Spectrum Disorder (ASD) screening project that combines SHAP explainability with machine learning not only achieves 98.3% accuracy but also makes the AI decision-making process transparent and traceable, providing an important reference paradigm for clinical AI applications.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-05-11T05:26:19.000Z
- Last activity: 2026-05-11T05:29:18.465Z
- Popularity: 141.9
- Keywords: autism screening, explainable AI, SHAP, machine learning, medical AI, ASD, behavior analysis, data science
- Page link: https://www.zingnex.cn/en/forum/thread/ai-edafabe3
- Canonical: https://www.zingnex.cn/forum/thread/ai-edafabe3
- Markdown source: floors_fallback

---

## [Introduction] Explainable AI Empowers Children's Autism Screening: From Black Box to Transparent Medical Practice

This article introduces autism-screening-explainability, an Autism Spectrum Disorder (ASD) screening project that combines SHAP explainability with machine learning. The project achieves 98.3% accuracy while making the AI decision-making process transparent and traceable through SHAP, offering a reference paradigm for clinical AI applications. It aims to address the trust problem in medical AI, help doctors and parents understand the basis of the model's judgments, and promote trustworthy AI applications in healthcare for children with special needs.

## Project Background: Clinical Pain Points of ASD Screening and Data Foundation

Early screening for ASD is a major challenge in pediatric medicine. Traditional methods rely on clinicians' experience and questionnaires, and timely diagnosis is difficult in resource-limited areas. Machine learning brings new possibilities for screening, but the "black box" problem undermines trust. This project stems from a clinical need: a model that is both accurate and able to explain which behavioral indicators drive its output, so as to support personalized intervention. The project uses the UCI Autism Screening Dataset for Children (292 children, 21 features, including the A1-A10 behavioral questionnaire items from Q-CHAT). The class distribution is balanced (151 negative / 141 positive), providing a sound foundation for model training.
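To ground the discussion, below is a minimal sketch of loading and inspecting the data in Python. The file name and the `Class/ASD` target column are assumptions based on common distributions of the UCI dataset and may differ from the project's own layout.

```python
import pandas as pd

# Minimal sketch: load a local CSV export of the UCI Autism Screening
# (Children) dataset. The file name and the "Class/ASD" target column
# are assumptions based on common distributions of this dataset.
df = pd.read_csv("autism_screening_children.csv")

print(df.shape)                         # expected: (292, 21)
print(df["Class/ASD"].value_counts())   # expected: roughly 151 NO / 141 YES
```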

## Technical Methods: Rigorous ML Process and Multi-Model Comparison

The project follows a rigorous workflow. First, data-leakage detection removes features highly correlated with the target variable, such as `result` and `age_desc`. Feature engineering emphasizes standardization (StandardScaler); notably, SVM accuracy rose from 50.8% to 98.3% after standardization. Six models were compared:

| Model | Accuracy | AUC-ROC |
| --- | --- | --- |
| Logistic Regression | 98.3% | 1.0 |
| SVM | 98.3% | 1.0 |
| Random Forest | 96.6% | 0.997 |
| Gradient Boosting | 91.5% | not reported |
| Decision Tree | 89.8% | not reported |
| K-Nearest Neighbors | 83.1% | not reported |

Stratified k-fold cross-validation was used for evaluation to ensure generalization performance; a minimal version of this setup is sketched below.
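The sketch below reuses `df` from the previous snippet. Column names are assumptions from the common UCI distribution, missing values are assumed to have been handled already, and categorical columns are one-hot encoded here so that StandardScaler applies cleanly.

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Drop the leakage-prone columns named in the article; one-hot encode the
# remaining categorical columns (gender, ethnicity, etc.). Column names
# are assumptions; missing values are assumed to be handled beforehand.
X = pd.get_dummies(df.drop(columns=["Class/ASD", "result", "age_desc"]))
y = (df["Class/ASD"] == "YES").astype(int)

# Scaling inside the pipeline is fit on training folds only, so the
# validation folds see no information from the scaler.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Keeping the scaler inside the pipeline is what makes the cross-validation honest: fitting StandardScaler on the full dataset before splitting would leak fold statistics into validation scores.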

## SHAP Explainability: The Key to Unlocking the AI Black Box

The project introduces SHAP (SHapley Additive exPlanations, a game-theoretic feature-attribution method) to quantify the magnitude and direction of each feature's contribution. Key insights:

1. The A1-A10 behavioral questionnaire items are far more important than demographic features, consistent with the focus of clinical ABA therapy.
2. Individual predictions can be explained; for example, the absence of A3 (directing behavior), the lack of A4 (interest sharing), and the presence of A10 (purposeless staring) all push the prediction toward higher risk (see the sketch after this list).
3. Visual charts (stored in outputs/) make the results easy for clinical staff to understand and offer guidance for intervention.
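Here is one plausible SHAP setup for the pipeline from the previous sketch. The article does not specify which explainer the project used, so the model-agnostic `KernelExplainer` is an assumption; `shap.summary_plot` produces the kind of global importance chart described above.

```python
import shap

# Fit the pipeline from the previous sketch on all data, then explain it.
model.fit(X, y)

def predict_positive(data):
    """Probability of a positive screen from the fitted pipeline."""
    return model.predict_proba(data)[:, 1]

# KernelExplainer treats the pipeline as a black box; a small background
# sample keeps the computation tractable. This is one plausible setup,
# not necessarily the explainer the project itself used.
background = shap.sample(X, 50, random_state=0)
explainer = shap.KernelExplainer(predict_positive, background)

# Explain a handful of individual predictions, then summarize globally:
# which features push each prediction toward a positive screen, and how far.
shap_values = explainer.shap_values(X.iloc[:20])
shap.summary_plot(shap_values, X.iloc[:20])
```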

## Clinical Significance: Providing a Replicable Paradigm for Medical AI Implementation

The project offers a replicable paradigm for clinical AI applications:

- For data scientists, it demonstrates a standard medical-AI workflow: leakage detection, correlation analysis, standardization, cross-validation, and explainability (the leakage check is sketched after this list).
- For clinicians, it shows that AI can be transparent: SHAP results, combined with empirical validation, build trust.
- For families of children with special needs, a transparent screening process helps parents understand the basis of the risk assessment and supports day-to-day observation of their child's development.
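As a concrete example of the leakage-detection step, a minimal correlation check is sketched below, reusing `df` and `y` from the earlier snippets; the 0.95 cutoff is an illustrative choice, not the project's documented threshold.

```python
# Flag features whose correlation with the target is suspiciously high.
# 'result' (the raw questionnaire score) is the typical offender in this
# dataset. The 0.95 cutoff is illustrative, not the project's threshold.
numeric = df.select_dtypes("number")
leaky = numeric.corrwith(y).abs().sort_values(ascending=False)
print(leaky[leaky > 0.95])  # candidates to drop before training
```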

## Limitations and Prospects: Next Steps from Research to Clinical Application

The project is intended for educational and research purposes, not for clinical diagnosis. Limitations include the small dataset (292 cases), the lack of multi-center validation, and the absence of deep learning baselines. Future directions: integrate larger and more diverse datasets, develop a web interface for non-technical users, explore combining deep learning with explainability methods, and conduct prospective clinical validation studies.
