Zing Forum

Impact of Missing Data on Model Inference: A Stability Study Based on Explainable AI

A study on how missing data affects the inference process of machine learning models, using explainable AI (XAI) techniques to analyze the decision stability of models when facing incomplete data, providing a theoretical basis for data quality assessment in practical applications.

Tags: Missing Data · Explainable AI (XAI) · Model Robustness · SHAP · Machine Learning · Data Quality · Model Interpretation · Feature Importance · AI Stability
Published 2026-04-20 23:03 · Recent activity 2026-04-20 23:27 · Estimated read 9 min

Section 01

[Main Floor] Introduction to Impact of Missing Data on Model Inference: A Stability Study Based on Explainable AI

This study focuses on the prevalent missing data problem in the real world, systematically exploring its impact on the decision stability of machine learning models through explainable AI (XAI) techniques. Core viewpoints:

  • Missing data may not only reduce model accuracy but also destabilize the inference process;
  • High prediction accuracy does not necessarily mean reliable explanations;
  • Missing patterns (rather than just missing proportions) have the greater impact on models;
  • Different types of models differ significantly in their robustness to missing data.

The study provides a theoretical basis and practical framework for data quality assessment and the construction of trustworthy AI.

Section 02

Research Background and Motivation

Prevalence of Missing Data

Missing data arises for various reasons and is conventionally classified into three types: Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR). Traditional handling methods (such as deletion and mean imputation) focus on completing the data but ignore how the missingness itself affects model inference.
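The two traditional strategies named above can be sketched in a few lines of NumPy (a minimal illustration of the general techniques, not the study's own code):

```python
import numpy as np

def mean_impute(X):
    """Replace each NaN with its column mean (simple mean imputation)."""
    X = X.astype(float).copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_means[cols]
    return X

def listwise_delete(X):
    """Drop every row that contains at least one NaN (listwise deletion)."""
    return X[~np.isnan(X).any(axis=1)]

X = np.array([[1.0, np.nan],
              [3.0, 4.0]])
mean_impute(X)      # → [[1., 4.], [3., 4.]]
listwise_delete(X)  # → [[3., 4.]]
```

Both methods produce a complete matrix, which is exactly why they can mask the inference-level effects the study is concerned with: the model never sees which values were observed and which were filled in.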

Rise of Explainable AI

The application of deep learning in key fields has promoted the development of XAI techniques (e.g., SHAP, LIME), which can reveal the decision logic of models.

Research Innovation

This study extends XAI from explaining individual predictions to evaluating inference stability under missing data, filling gaps in related fields.

Section 03

Research Methodology

Core Questions

  1. Changes in prediction accuracy;
  2. Stability of feature importance ranking;
  3. Decision boundary drift;
  4. Maintenance of explanation consistency.

Experimental Design

  • Datasets: UCI medical, financial, and social survey datasets (10-100+ dimensions, thousands to tens of thousands of samples);
  • Model Types: Traditional ML (random forest, SVM, etc.), deep learning (MLP, etc.), ensemble models (XGBoost, etc.);
  • Missing Simulation: Random feature missing (10%/30%/50%/70%), structured missing, progressive missing;
  • XAI Methods: SHAP, Permutation Importance, Partial Dependence Plots.
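The random-missingness setting above (10%–70% of entries masked completely at random) can be simulated with a small helper like this (an illustrative sketch; the study's own simulation code is not shown):

```python
import numpy as np

def apply_mcar(X, rate, seed=None):
    """Mask a fraction `rate` of entries uniformly at random (MCAR)."""
    rng = np.random.default_rng(seed)
    X = X.astype(float).copy()
    X[rng.random(X.shape) < rate] = np.nan
    return X

X = np.ones((1000, 10))
for rate in (0.1, 0.3, 0.5, 0.7):   # the four missing levels used in the study
    Xm = apply_mcar(X, rate, seed=42)
    print(rate, round(np.isnan(Xm).mean(), 3))
```

Structured and progressive missingness would instead mask whole columns, correlated blocks, or a growing fraction of entries; MCAR is the simplest of the three schemes listed.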

Stability Indicators

  • Prediction stability: Prediction variance, confidence change;
  • Explanation stability: Kendall Tau coefficient (feature ranking), Jensen-Shannon divergence (SHAP distribution);
  • Decision boundary stability: Geometric change, adversarial sample sensitivity.
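The two explanation-stability indicators can be computed as follows (a self-contained sketch with a naive O(n²) Kendall implementation; in practice one would use `scipy.stats.kendalltau` on feature-importance vectors and compare normalized mean-|SHAP| distributions):

```python
import numpy as np
from itertools import combinations

def kendall_tau(a, b):
    """Kendall rank correlation between two score vectors (no tie handling)."""
    n = len(a)
    s = sum(np.sign(a[i] - a[j]) * np.sign(b[i] - b[j])
            for i, j in combinations(range(n), 2))
    return 2 * s / (n * (n - 1))

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two normalized distributions."""
    p = np.asarray(p, float) / np.sum(p)
    q = np.asarray(q, float) / np.sum(q)
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A Kendall tau near 1 means the feature ranking survived the missingness intact; a JS divergence near 0 means the attribution mass stayed on the same features. Both dropping while accuracy holds is exactly the decoupling the study reports.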

Section 04

Key Findings

Finding 1: Decoupling of Accuracy and Explanation Stability

Models may maintain high accuracy while their explanations become unstable (e.g., a medical model's AUC drops by only 0.07, yet its feature ranking changes substantially), suggesting that such models may be relying on spurious correlations.

Finding 2: Differences in Model Robustness

  • Tree ensemble models: Robust to random missingness but sensitive to structured missingness, with feature importances prone to abrupt spikes;
  • Neural networks: Sensitive to missing continuous features, with explanation stability degrading faster, though they can partially compensate through learned feature representations.

Finding 3: Feature Importance Illusion

When key features are missing, models shift weights to proxy variables or amplify noise features, leading to incorrect explanations (e.g., relying on non-causal features in medical diagnosis).

Finding 4: Missing Patterns Matter More

Losing 10% of key features harms the model more than losing 50% of marginal features; structured missingness is more damaging than random missingness; and MNAR patterns introduce the most severe bias.

Section 05

Practical Implications and Recommendations

For Model Developers

  1. Incorporate missing robustness into model selection criteria, simulate missing scenarios to evaluate stability curves;
  2. Monitor XAI stability indicators, trigger alerts when feature ranking changes significantly;
  3. Quantify uncertainty (e.g., Bayesian NN, MC Dropout).
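Recommendation 3 can be illustrated with a minimal Monte Carlo dropout pass in NumPy (a toy two-layer network with hypothetical random weights, not the study's architecture): dropout is kept active at inference, and many stochastic forward passes are aggregated into a mean prediction plus an uncertainty estimate.

```python
import numpy as np

def mc_dropout_predict(x, W1, W2, n_passes=200, p_drop=0.5, seed=None):
    """Monte Carlo dropout: keep dropout ON at inference and average
    many stochastic forward passes; the spread estimates uncertainty."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_passes):
        h = np.maximum(x @ W1, 0.0)               # ReLU hidden layer
        keep = rng.random(h.shape) >= p_drop      # random dropout mask
        h = h * keep / (1.0 - p_drop)             # inverted-dropout scaling
        preds.append(h @ W2)
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)  # prediction, uncertainty

rng = np.random.default_rng(0)
x  = rng.normal(size=(5, 8))    # 5 records, 8 features (toy data)
W1 = rng.normal(size=(8, 16))   # hypothetical trained weights
W2 = rng.normal(size=(16,))
mean, std = mc_dropout_predict(x, W1, W2, seed=1)
```

Records whose `std` is large (e.g., because imputed values push them into unfamiliar regions) are exactly the ones a monitoring system should flag rather than auto-predict.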

For Business Applications

  1. Build a data quality dashboard to monitor missing ratios and patterns;
  2. Layered decision-making: Direct prediction for high completeness, manual review for medium completeness, reject prediction for low completeness;
  3. Label explanation credibility, remind of potential risks when data is incomplete.
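The layered policy in point 2 reduces to a small routing rule; the completeness thresholds below are illustrative assumptions, not values from the study:

```python
def route_by_completeness(missing_frac, predict_max=0.1, review_max=0.5):
    """Tiered decision policy keyed to a record's missing-value fraction.
    Thresholds are illustrative, not taken from the study."""
    if missing_frac <= predict_max:
        return "predict"        # high completeness: automatic prediction
    if missing_frac <= review_max:
        return "manual_review"  # medium completeness: route to a human
    return "reject"             # low completeness: refuse to predict

route_by_completeness(0.05)  # → 'predict'
route_by_completeness(0.30)  # → 'manual_review'
route_by_completeness(0.80)  # → 'reject'
```

In line with Finding 4, a production version would also weight the thresholds by which features are missing, since a small gap in key features can matter more than a large gap in marginal ones.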

Section 06

Limitations and Future Directions

Current Limitations

  1. Static analysis: Does not involve the impact of missing data during the training process;
  2. Feature independence assumption: Does not fully consider complex correlations in real data;
  3. XAI method limitations: SHAP and others have computational costs and approximation errors.

Future Directions

  1. Active missing handling: Decide which features to collect (balance cost and gain);
  2. Causal perspective: Distinguish between correlation and causality to avoid explanation illusions;
  3. Dynamic missing adaptation: Develop adaptive models to adjust inference strategies;
  4. Human-in-the-loop: Optimize feature query strategies (reduce cognitive burden).

Section 07

Summary

This study reveals the deep impact of missing data on model inference through rigorous experiments: it not only reduces accuracy but also undermines inference stability. The core conclusion is that high accuracy does not equal reliable explanation; in key applications, both prediction performance and explanation stability need to be monitored simultaneously. Data quality should run through the entire model lifecycle, and understanding model behavior when information is incomplete is a necessary step to build trustworthy AI. This study provides warnings and evaluation frameworks for related practices.