Combining Pre-trained CNNs with Classical Machine Learning: A Transfer Learning Practice for Medical Image Classification


Tags: transfer learning, medical imaging, CNN, pre-trained models, feature extraction, machine learning, ResNet, medical AI, image classification
Published 2026-04-30 22:46 · Recent activity 2026-04-30 22:58 · Estimated read: 8 min

Section 01

Introduction: Transfer Learning Practice for Medical Image Classification by Combining Pre-trained CNNs with Classical ML

This article introduces an undergraduate research project that explores the use of pre-trained convolutional neural networks (CNNs) as feature extractors combined with classical machine learning classifiers for medical image classification, demonstrating the practical value and an implementation path for transfer learning in medical AI. The project addresses the distinctive challenges of medical image classification (data scarcity, high annotation cost, class imbalance, and high generalization requirements) and offers practical solutions.


Section 02

Research Background and Challenges in Medical Image Classification

Medical image analysis is a socially valuable application of AI: it can assist physicians in interpreting X-rays, CT scans, and other images, improving clinical efficiency. However, it faces four major challenges:

  1. Data Scarcity: Medical data is subject to strict privacy protection, so available sample sizes are far smaller than natural-image datasets such as ImageNet;
  2. High Annotation Cost: Labeling requires professional physicians and is therefore time-consuming and expensive;
  3. Class Imbalance: Normal samples far outnumber diseased samples, so models tend to favor the majority class;
  4. High Generalization Requirements: Errors can have serious clinical consequences, so reliability requirements are extremely high.

Section 03

Core Methodology and Technology Selection

Transfer Learning Strategies

The project adopts transfer learning and compares two strategies:

  • Fine-tuning: Continue training some layers of the pre-trained model to adapt it to the new task; effective, but prone to overfitting on small datasets;
  • Feature Extraction: Freeze the convolutional layers and train an independent classifier on their output features; well suited to small-data scenarios (the strategy chosen for this project).

Pre-trained CNN Architectures

The project evaluates classic architectures: ResNet (residual connections mitigate vanishing gradients), VGG (a simple, regular structure), DenseNet (dense connectivity), and EfficientNet (efficient compound scaling).

Classical ML Classifiers

The classifiers considered are SVM (strong on high-dimensional features), Random Forest (a robust ensemble), Logistic Regression (a fast baseline), and gradient-boosted trees (a common competition workhorse).
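A quick way to compare these classifiers on extracted features is cross-validation with an imbalance-aware metric. The sketch below uses scikit-learn with synthetic data standing in for the CNN features (the dataset, dimensions, and class ratio are illustrative assumptions, not the project's actual data):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for CNN features: 300 samples x 128 dims, ~85/15 imbalance.
X, y = make_classification(n_samples=300, n_features=128, n_informative=32,
                           weights=[0.85, 0.15], random_state=0)

classifiers = {
    "logreg": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "svm": SVC(kernel="rbf", class_weight="balanced"),
    "rf": RandomForestClassifier(n_estimators=200, class_weight="balanced",
                                 random_state=0),
    "gbt": GradientBoostingClassifier(random_state=0),
}
for name, clf in classifiers.items():
    # Balanced accuracy is more informative than raw accuracy
    # when one class dominates.
    scores = cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy")
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Using `class_weight="balanced"` is one of the class-imbalance mitigations mentioned later in the experimental pipeline.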


Section 04

Experimental Process and Result Analysis

Experimental Process

  1. Data Preprocessing: Normalization (size/pixel value), data augmentation (rotation/cropping), class balance (oversampling/SMOTE/class weights);
  2. Feature Extraction: Raw image → pre-trained CNN → global average pooling → feature vector → classifier;
  3. Classifier Training: Stratified K-fold cross-validation, hyperparameter tuning (grid/random search), regularization (L1/L2/early stopping).
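Steps 2 and 3 above can be combined into a single tuning pipeline. A minimal sketch with scikit-learn, again using synthetic features as a stand-in for the CNN output (the parameter grid and dataset shape are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for extracted CNN features with imbalanced binary labels.
X, y = make_classification(n_samples=240, n_features=64, n_informative=16,
                           weights=[0.8, 0.2], random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),            # normalize feature scales
    ("svm", SVC(class_weight="balanced")),  # class weights counter imbalance
])
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": ["scale", 1e-3]}

# Stratified folds preserve the class ratio in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="f1")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

SMOTE or other oversampling (via `imbalanced-learn`) could replace the class weights, and `RandomizedSearchCV` replaces the grid when the parameter space grows.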

Result Comparison

  • CNN Architectures: ResNet-50 performs best overall, while EfficientNet is the most computationally efficient;
  • Classifiers: SVM and Random Forest perform similarly, and ensemble methods outperform single models;
  • Key Findings: Pre-trained features significantly improve performance, the feature-extraction strategy beats fine-tuning in small-data scenarios, and classical ML remains effective when paired with good features.

Section 05

Special Considerations for Medical Applications and Practical Insights

Special Medical Considerations

  • Interpretability: Generate heatmaps using Grad-CAM, analyze feature importance, and examine error cases;
  • Uncertainty Quantification: Prediction probability, ensemble variance, outlier detection;
  • Fairness: Ensure the model performs consistently across different populations.
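The ensemble-variance idea for uncertainty quantification has a simple concrete form: the disagreement among the trees of a random forest. A sketch under assumed synthetic data (the flag-for-review threshold and dataset are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=32, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Per-tree probabilities for the positive class: shape (n_trees, n_samples).
per_tree = np.stack([t.predict_proba(X_te)[:, 1] for t in forest.estimators_])
mean_prob = per_tree.mean(axis=0)   # the forest's predicted probability
uncertainty = per_tree.std(axis=0)  # tree disagreement = ensemble variance

# Flag the least certain cases, e.g. for physician review.
review = np.argsort(uncertainty)[-5:]
print(mean_prob[review], uncertainty[review])
```

Cases where the trees disagree strongly are exactly the ones a clinical workflow should route to a human reader rather than trust the model on.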

Practical Insights

  • Technology Selection: Pre-trained models are a must, prioritize feature extraction for small data, do not ignore classical ML, and focus on interpretability;
  • Engineering Key Points: Prioritize data quality, strict cross-validation, in-depth error analysis, and integration with clinical practice.

Section 06

Project Limitations and Improvement Directions

Current Limitations

  1. Small dataset size;
  2. Only focuses on binary classification tasks;
  3. Does not use the latest architectures such as Vision Transformer.

Improvement Directions

  1. Use public medical datasets (e.g., MIMIC-CXR);
  2. Multimodal fusion (imaging + clinical text);
  3. Use self-supervised learning to leverage unlabeled data;
  4. Try vision transformers such as Swin Transformer;
  5. Use federated learning to utilize multi-center data while protecting privacy.

Section 07

Conclusion: Project Value and Significance

This project demonstrates the effectiveness of combining deep learning with traditional ML to solve medical image classification problems. Its core contributions include:

  1. Method Practicality: The feature extraction + classical ML strategy is simple and effective, suitable for resource-constrained scenarios;
  2. Systematic Comparison: Comprehensive evaluation of CNN architecture and classifier combinations;
  3. Domain Adaptation: Design solutions targeting the unique characteristics of medical data.

The project is an excellent reference for medical AI beginners, proving that reasonable method selection and careful system design can achieve valuable results.