STARAPTOR: Multi-center Renal Pathology Image Data Harmonization and Transplant Prognosis Prediction

This study examines harmonization of multi-center renal pathology image data. By comparing six data harmonization methods, it addresses batch effects caused by differences in scanners and staining protocols across institutions, and reports substantially better predictive accuracy for machine learning models of kidney transplant prognosis.

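Tags: Data harmonization · Multi-center research · Renal pathology · Machine learning · ComBat · Batch effects · Kidney transplantation · Medical AI
Published 2026-05-28 09:15 · Recent activity 2026-05-28 09:20 · Estimated read 8 min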

Section 01

STARAPTOR Project Introduction: Multi-center Renal Pathology Data Harmonization Improves Transplant Prognosis Prediction

The STARAPTOR project addresses batch effects in multi-center renal pathology image data, that is, systematic bias caused by differences in scanners and staining protocols across institutions. It systematically compares six data harmonization methods and finds that ComBat performs best, substantially improving the accuracy of machine learning models that predict kidney transplant prognosis, measured as estimated glomerular filtration rate (eGFR) and delayed graft function (DGF). The result offers a methodological template for multi-center medical AI research.

Section 02

Batch Effect Challenges in Multi-center Medical Research

Single-institution datasets have limited sample sizes, which makes multi-center collaboration hard to avoid. However, technical differences between hospitals (scanners, tissue processing, staining protocols) introduce batch effects that can mask real biological signals. Renal pathology is particularly sensitive: predicting transplant prognosis from donor biopsy whole-slide images (WSI) requires precise feature quantification, and a model trained on directly pooled data from UC Davis, University of Coimbra, and Mayo Clinic risks learning institutional artifacts rather than pathological patterns. The STARAPTOR project evaluates six harmonization methods to address this.

Section 03

Study Design and Comparison of Harmonization Methods

Data Sources and Prediction Objectives

  • Data: Donor kidney biopsy WSI radiomic features from UC Davis, University of Coimbra, and Mayo Clinic (165 matched features)
  • Prediction endpoints: 12-month post-transplant eGFR (regression), DGF (classification)
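The results later in this article score the regression endpoint (eGFR) with mean squared error and the classification endpoint (DGF) with AUC. As a rough illustration of what those two numbers measure, here is a minimal pure-Python sketch. The function names and the toy values are illustrative assumptions, not code or data from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels, scores):
    """Probability that a random positive case is scored above a random negative one.

    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy values only, not data from the study.
egfr_observed = [55.0, 62.0, 48.0]
egfr_predicted = [50.0, 60.0, 52.0]
print(mean_squared_error(egfr_observed, egfr_predicted))  # 15.0

dgf_labels = [0, 0, 1, 1]
dgf_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(dgf_labels, dgf_scores))  # 0.75
```

Lower MSE is better for eGFR, while higher AUC is better for DGF, which is why the reported improvements point in opposite directions.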

Six Harmonization Methods

| Method | Principle | Applicable Scenario |
|---|---|---|
| Unharmonized | Raw data without harmonization | Baseline control |
| Z-Score | Feature standardization (zero mean, unit variance) | Simple linear offset correction |
| RAVEL | Linear adjustment based on reference variables | Known batch-related variables |
| CORAL | Correlation alignment (second-order statistic matching) | Differences in feature covariance structure |
| CovBat | ComBat extended to also correct batch effects in feature covariance | Batch effects that persist in the covariance between features |
| ComBat | Empirical Bayesian batch correction | Classic batch effect removal |
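To make the simplest entry concrete, the following sketch standardizes one feature within each site, so that every site ends up with zero mean and unit variance. Whether the study standardizes per site or over the pooled data is not stated here, so the per-site choice, the function name `zscore_by_site`, and the toy values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_site(values, sites):
    """Standardize one feature within each site (zero mean, unit variance per site)."""
    grouped = defaultdict(list)
    for v, s in zip(values, sites):
        grouped[s].append(v)
    # Per-site location and scale; a zero spread falls back to 1.0 to avoid division by zero.
    params = {s: (mean(vs), pstdev(vs) or 1.0) for s, vs in grouped.items()}
    return [(v - params[s][0]) / params[s][1] for v, s in zip(values, sites)]


# Toy feature: site "B" reads 10 units higher than site "A".
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
sites = ["A", "A", "A", "B", "B", "B"]
harmonized = zscore_by_site(values, sites)
print([round(h, 3) for h in harmonized])
```

After the transform both sites share the same three values, so the additive scanner offset is gone. The catch is that this also erases any genuine biological difference between site populations, which is one reason model-based methods such as ComBat are often preferred.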

Section 04

Experimental Results: Harmonization Methods Significantly Improve Predictive Performance

Aggregated Data Experiment

  • eGFR prediction: XGBoost+ComBat (MSE 239) reduced MSE by 32.3% compared to unharmonized (353)
  • DGF prediction: XGBoost+ComBat (AUC 0.961) increased AUC by 37.5% compared to unharmonized (0.699)

LOO Cross-Validation (Generalization Test)

  • eGFR: XGBoost+LOO ComBat (MSE 372) reduced MSE by 25.5% compared to unharmonized (499)
  • DGF: XGBoost+LOO ComBat (AUC 0.829) increased AUC by 37.0% compared to unharmonized (0.605)
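The percentages quoted in both experiments are relative changes against the unharmonized baseline. A short check of the arithmetic, using only the numbers given in the text (the helper name `relative_change_pct` is an illustrative assumption):

```python
def relative_change_pct(baseline, harmonized):
    """Percentage change of the harmonized result relative to the unharmonized baseline."""
    return (harmonized - baseline) / baseline * 100.0


# (baseline, harmonized) pairs quoted in the text above.
reported = {
    "aggregated eGFR MSE": (353, 239),
    "aggregated DGF AUC": (0.699, 0.961),
    "LOO eGFR MSE": (499, 372),
    "LOO DGF AUC": (0.605, 0.829),
}
for name, (baseline, harmonized) in reported.items():
    print(f"{name}: {relative_change_pct(baseline, harmonized):+.1f}%")
# aggregated eGFR MSE: -32.3%
# aggregated DGF AUC: +37.5%
# LOO eGFR MSE: -25.5%
# LOO DGF AUC: +37.0%
```

All four figures match the values reported. Note that the AUC gains are relative (for example 0.699 to 0.961), not absolute percentage-point differences.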

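Key findings: ComBat and CovBat were the most stable methods; XGBoost benefited the most from harmonization; and harmonization must also be applied at inference time, because a model trained on harmonized data but evaluated on raw data (Harm→Raw) performs worse.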
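The Harm→Raw finding can be reproduced on a toy problem. The sketch below is not the study's pipeline: it swaps ComBat for plain per-site mean centering and uses a one-feature threshold classifier, with made-up values, purely to show why a model trained on harmonized inputs breaks when it is fed raw inputs.

```python
from statistics import mean


def center_by_site(values):
    """Remove a site's additive offset by subtracting that site's own mean."""
    m = mean(values)
    return [v - m for v in values]


def fit_threshold(values, labels):
    """Toy classifier: threshold halfway between the two class means."""
    pos = mean(v for v, y in zip(values, labels) if y == 1)
    neg = mean(v for v, y in zip(values, labels) if y == 0)
    return (pos + neg) / 2.0


def accuracy(values, labels, threshold):
    """Fraction of samples whose thresholded prediction matches the label."""
    predictions = [1 if v > threshold else 0 for v in values]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


labels = [0, 0, 0, 1, 1, 1]
site_a_raw = [10.0, 11.0, 12.0, 14.0, 15.0, 16.0]  # training site
site_b_raw = [20.0, 21.0, 22.0, 24.0, 25.0, 26.0]  # same pattern, +10 scanner offset

# The model is trained on harmonized data from site A.
threshold = fit_threshold(center_by_site(site_a_raw), labels)

harm_to_harm = accuracy(center_by_site(site_b_raw), labels, threshold)
harm_to_raw = accuracy(site_b_raw, labels, threshold)
print(harm_to_harm, harm_to_raw)  # 1.0 0.5
```

With harmonized test data the classifier is perfect; with raw test data every sample sits above the learned threshold, so accuracy drops to chance.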

Section 05

Technical Implementation and Pipeline Workflow

Reproducible Pipeline Steps

| Step | Script | Function |
|---|---|---|
| 1 | 01_preprocess_data.py | Load data, aggregate subjects, calculate outcomes |
| 2 | 02_prepare_features.py | Match features, align naming, impute missing values |
| 2.5 | 02.5_alt_harm_methods.py | Z-Score/CORAL and other harmonization |
| 2.5 | 02.5_harmonize.Rmd | ComBat/CovBat harmonization (R package) |
| 3 | 03_loo_combat.py | LOO ComBat + model training |
| 3 | 03_train_models.py | Full scenario training |
| 3.5 | 03.5_mrmr_feature_selection.py | Feature selection optimization |
| 4 | 04_process_results.py | Result aggregation |
| 5 | 05_visualize.py | Generate charts |
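One way to run these steps in order is a small driver script. The sketch below is hypothetical: the file names come from the step list above, but the way each script is invoked (plain `python script.py`, and `rmarkdown::render` for the `.Rmd` step) is an assumption, since the article does not document command-line usage.

```python
import subprocess
import sys

# Step order and file names as listed in the step table.
PIPELINE = [
    "01_preprocess_data.py",
    "02_prepare_features.py",
    "02.5_alt_harm_methods.py",
    "02.5_harmonize.Rmd",
    "03_loo_combat.py",
    "03_train_models.py",
    "03.5_mrmr_feature_selection.py",
    "04_process_results.py",
    "05_visualize.py",
]


def build_command(script):
    """Return the command line for one step (assumed invocation, not taken from the project)."""
    if script.endswith(".Rmd"):
        # R Markdown steps are rendered through Rscript.
        return ["Rscript", "-e", f"rmarkdown::render('{script}')"]
    return [sys.executable, script]


def run_pipeline(dry_run=True):
    """Run every step in order; with dry_run=True the commands are only printed."""
    for script in PIPELINE:
        command = build_command(script)
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)  # stop at the first failing step


run_pipeline(dry_run=True)
```

Calling `run_pipeline(dry_run=False)` would execute the steps for real, provided the scripts, R, and the required packages are available in the working directory.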

Environment Configuration

  • Python: Preprocessing, ML modeling
  • R: ComBat/CovBat (ComBatFamQC package)
  • Configuration: a config.py file is required to specify data paths

Section 06

Methodological Insights and Clinical Significance

Reasons for ComBat's Optimal Performance

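  1. Its empirical Bayesian framework pools information across features, which stabilizes estimates for batches with few samples
  2. It can preserve biological signal while removing batch-related variation
  3. Its harmonization parameters transfer to held-out data, as the LOO validation indicates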
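For readers unfamiliar with the method, the standard ComBat location-scale model can be written as follows. This is the general textbook formulation, not a description of the exact configuration used in the study; in particular, which covariates $x_{ij}$ were included (if any) is not stated in this article. Here $i$ indexes the center, $j$ the subject, and $f$ the feature.

```latex
% Model: feature value = overall mean + covariate effects
%        + additive batch shift + multiplicative batch scaling of the noise
y_{ijf} = \alpha_f + x_{ij}^{\top}\beta_f + \gamma_{if} + \delta_{if}\,\varepsilon_{ijf},
\qquad \varepsilon_{ijf} \sim \mathcal{N}(0, \sigma_f^{2})

% Harmonized value: remove the batch shift and scaling, keep mean and covariate effects
y^{*}_{ijf} = \frac{y_{ijf} - \hat{\alpha}_f - x_{ij}^{\top}\hat{\beta}_f - \gamma^{*}_{if}}{\delta^{*}_{if}}
            + \hat{\alpha}_f + x_{ij}^{\top}\hat{\beta}_f
```

The starred quantities $\gamma^{*}_{if}$ and $\delta^{*}_{if}$ are empirical Bayes estimates, shrunk toward center-level priors shared across features. That shrinkage is what stabilizes small batches, and keeping the $x_{ij}^{\top}\hat{\beta}_f$ term is what lets covariate-related signal survive the correction.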

Implications for Clinical AI Deployment

  1. Data harmonization is essential for multi-center research
  2. Method selection must match data characteristics
  3. LOO validation more strictly evaluates generalization ability
  4. Harmonization parameters can be transferred to new institutions
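The article does not spell out what "LOO" holds out. Assuming it means leaving out one center at a time, which is consistent with the cross-center generalization framing, the split logic might look like the sketch below. The function name and the toy sample counts are illustrative assumptions.

```python
def leave_one_center_out(centers):
    """Yield (held_out_center, train_indices, test_indices) for each distinct center."""
    for held_out in sorted(set(centers)):
        train = [i for i, c in enumerate(centers) if c != held_out]
        test = [i for i, c in enumerate(centers) if c == held_out]
        yield held_out, train, test


# Center label per sample; the counts are toy values, not the study's cohort sizes.
centers = ["UC Davis", "UC Davis", "Coimbra", "Coimbra", "Coimbra", "Mayo"]
for held_out, train, test in leave_one_center_out(centers):
    print(held_out, train, test)
# Coimbra [0, 1, 5] [2, 3, 4]
# Mayo [0, 1, 2, 3, 4] [5]
# UC Davis [2, 3, 4, 5] [0, 1]
```

Under this reading, harmonization parameters and the model are both fitted on the training centers only, and the held-out center plays the role of a new institution.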

Section 07

Limitations and Future Directions

Limitations

  • Limited sample size
  • Dependence on predefined radiomic features
  • Does not cover emerging deep learning harmonization methods

Future Directions

  • Integrate deep learning features with ComBat harmonization
  • Develop harmonization methods specific to pathology images
  • Establish standardized multi-center data collection protocols
  • Explore applicability to other organ transplants

Section 08

Project Summary: Data Harmonization is Key to Multi-center Medical AI

The STARAPTOR project demonstrates the central role of data harmonization in multi-center medical AI: ComBat performed best among the methods compared, substantially improving the accuracy of kidney transplant prognosis prediction and retaining its advantage in cross-center generalization. The study provides a methodological template for multi-center imaging AI, and data harmonization is likely to be a key step in ensuring model reliability and fairness in clinical deployment.