Zing Forum

Machine Learning Practice for Skin Lesion Classification: A Comparative Study from Feature Extraction to Deep Fine-Tuning

This article introduces a multi-class skin lesion classification project built on the HAM10000 dataset. It compares two approaches, frozen feature extraction with an SVM classifier and deep fine-tuning, as a practical reference for medical imaging AI applications.

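Tags: skin lesion classification, HAM10000, transfer learning, deep learning, medical imaging, SVM, CNN, fine-tuning, computer-aided diagnosis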
Published 2026-06-16 04:16 | Recent activity 2026-06-16 04:23 | Estimated read 7 min

Section 01

Machine Learning Practice for Skin Lesion Classification: Introduction to the Comparative Study of Two Methods

This article introduces a multi-class skin lesion classification project based on the HAM10000 dataset. The project systematically compares two methods, frozen feature extraction combined with an SVM and deep fine-tuning, and is intended as a practical reference for medical imaging AI applications. It was originally published on GitHub by Rafaela Mlucca on June 15, 2026.

Section 02

Project Background and Significance

Skin cancer is one of the most common types of cancer globally, and early, accurate diagnosis is crucial for prognosis. The HAM10000 dataset is an important benchmark in skin lesion classification and covers seven common lesion types. The project aims not only to build a high-accuracy classifier but also to compare the strengths and weaknesses of different technical routes, so that the choice of technique in practice can rest on evidence.

Section 03

Comparison of Technical Routes: Frozen Feature Extraction + SVM vs. Deep Fine-Tuning

Frozen Feature Extraction + SVM

  • Process: Use a pre-trained CNN (e.g., ResNet or VGG) as a fixed feature extractor and feed its output feature vectors to an SVM classifier
  • Advantages: Fast training, low computational cost, less prone to overfitting
  • Limitations: Pre-trained features may transfer poorly to the specifics of medical images, and the extractor itself is never optimized for the task
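
The SVM half of this pipeline can be sketched as follows. This is a minimal illustration, not the project's code: it assumes scikit-learn is available, and the feature matrix `X` is synthetic data standing in for embeddings that a frozen backbone would produce. The dimensions, the RBF kernel, and `C=1.0` are illustrative choices.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-in for CNN embeddings. In a real pipeline each
# row would be the pooled output of the frozen backbone for one image.
n_per_class, n_classes, dim = 30, 3, 64
centers = rng.normal(scale=2.0, size=(n_classes, dim))
X = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
y = np.repeat(np.arange(n_classes), n_per_class)

# Standardize the features, then fit an RBF SVM. class_weight="balanced"
# reweights classes inversely to their frequency in the training folds.
clf = make_pipeline(
    StandardScaler(),
    SVC(kernel="rbf", C=1.0, class_weight="balanced"),
)

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="balanced_accuracy")
print(f"balanced accuracy per fold: {np.round(scores, 3)}")
```

Because the backbone is frozen, the embeddings only need to be computed once and can be cached, which is where most of the training-time savings come from.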

Deep Fine-Tuning

  • Process: Fine-tune the pre-trained model weights in stages, gradually unfreezing more layers for updates
  • Advantages: Can learn task-specific features and capture subtle differences between skin lesions
  • Limitations: Requires more data and computational resources, and carries a risk of overfitting
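
The staged release of layers can be expressed as a small, framework-agnostic schedule. The sketch below is an assumption about how such a schedule might look, not the project's implementation: the group names mirror the top-level module names of a ResNet, and the helpers `trainable_prefixes` and `is_trainable` are hypothetical.

```python
# Layer groups ordered from input to output. The names mirror the top-level
# modules of a ResNet but serve here only as illustrative labels.
LAYER_GROUPS = [
    ("conv1", "bn1"),
    ("layer1",),
    ("layer2",),
    ("layer3",),
    ("layer4",),
    ("fc",),
]


def trainable_prefixes(stage: int) -> set[str]:
    """Return the module names that receive gradient updates at a stage.

    Stage 0 trains only the classifier head. Each later stage releases one
    more group, moving from the output side toward the input side.
    """
    if stage < 0:
        raise ValueError("stage must be non-negative")
    n_unfrozen = min(stage + 1, len(LAYER_GROUPS))
    return {name for group in LAYER_GROUPS[-n_unfrozen:] for name in group}


def is_trainable(param_name: str, stage: int) -> bool:
    """Decide whether a parameter such as 'layer4.0.conv1.weight' is updated."""
    top_level = param_name.split(".")[0]
    return top_level in trainable_prefixes(stage)


for stage in range(3):
    print(stage, sorted(trainable_prefixes(stage)))
```

In a deep learning framework, a predicate like `is_trainable` would typically be applied to each named parameter at the start of a stage to switch gradient updates on or off.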

Section 04

Characteristics of the HAM10000 Dataset

Contains 10015 dermoscopic images covering seven lesion types:

  • Melanocytic lesions: Melanocytic nevi (nv), Melanoma (mel)
  • Benign keratosis-like lesions (bkl): includes seborrheic keratoses and solar lentigines
  • Fibrous lesions: Dermatofibroma (df)
  • Vascular lesions (vasc)
  • Other malignant or precancerous lesions: Basal cell carcinoma (bcc), Actinic keratoses and intraepithelial carcinoma (akiec)

The class distribution is imbalanced, so training typically needs a balancing strategy such as oversampling, class weighting, or focal loss.
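
Two of these strategies, class weighting and focal loss, reduce to short formulas. The sketch below is illustrative only. The per-class counts are the commonly reported figures for HAM10000 and should be treated as an assumption to be recounted from the metadata file; `gamma = 2.0` is a conventional default, not a value taken from the project.

```python
import math

# ASSUMPTION: commonly reported per-class image counts (they sum to 10015).
CLASS_COUNTS = {
    "nv": 6705, "mel": 1113, "bkl": 1099, "bcc": 514,
    "akiec": 327, "vasc": 142, "df": 115,
}


def balanced_class_weights(counts: dict[str, int]) -> dict[str, float]:
    """Weight each class by N / (K * n_c), so rare classes count more."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


def focal_loss(p_true: float, gamma: float = 2.0, weight: float = 1.0) -> float:
    """Focal loss for one sample, given the predicted probability of its true class.

    With gamma = 0 this reduces to weighted cross-entropy. Larger gamma
    shrinks the loss of samples the model already classifies confidently.
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    return -weight * (1.0 - p_true) ** gamma * math.log(p_true)


weights = balanced_class_weights(CLASS_COUNTS)
for name, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{name:>5}: weight {w:.2f}")
```

With these counts the rarest class (df) receives a weight roughly 58 times that of the most common class (nv), which shows how skewed the raw distribution is.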

Section 05

Experimental Design and Evaluation Metrics

  • Data Partitioning: Stratified k-fold cross-validation
  • Evaluation Metrics: Overall accuracy, Sensitivity (Recall), Specificity, F1 score, AUC-ROC, Confusion matrix
  • Data Augmentation: Apply the same augmentation strategies (rotation, flipping, brightness adjustment, etc.) to both methods to ensure a fair comparison
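
Several of these metrics can be read directly off a multi-class confusion matrix by treating each class one-vs-rest. The sketch below shows the arithmetic. The 3-class matrix is made up for illustration and is not a result from the project.

```python
def per_class_metrics(confusion: list[list[int]]) -> list[dict[str, float]]:
    """One-vs-rest metrics per class.

    confusion[i][j] is the number of samples of true class i predicted as j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                         # missed cases of class c
        fp = sum(confusion[r][c] for r in range(n)) - tp    # false alarms for class c
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        results.append(
            {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}
        )
    return results


# Made-up confusion matrix (rows = true class, columns = predicted class).
cm = [
    [50, 3, 2],
    [4, 20, 1],
    [1, 2, 7],
]
metrics = per_class_metrics(cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))
macro_f1 = sum(m["f1"] for m in metrics) / len(metrics)
print(f"accuracy={accuracy:.3f}  macro-F1={macro_f1:.3f}")
```

On imbalanced data, per-class sensitivity and macro-averaged F1 are usually more informative than overall accuracy, because accuracy is dominated by the majority class.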

Section 06

Result Analysis and Discussion

Theoretically, deep fine-tuning is expected to perform better when data and compute are sufficient, while frozen feature extraction + SVM is more cost-effective when either is limited. The choice of method should weigh the following factors:

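  1. Data scale: Prioritize fine-tuning when enough labeled data is available
  2. Computational resources: Lightweight solutions are suitable for edge deployment
  3. Real-time performance: The SVM classifier adds little inference cost, but the frozen backbone must still run for every image, so overall latency depends mainly on the backbone
  4. Interpretability: Features of traditional methods are easier to interpret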

Section 07

Considerations for Clinical Implementation

  • Interpretability: Class Activation Maps (CAM) highlight the image regions that drive a prediction, which helps clinicians judge whether to trust it
  • Uncertainty Quantification: Output confidence scores and route low-confidence predictions to manual review
  • Continuous Learning: Support model updates without forgetting existing knowledge
  • Fairness: Ensure consistent performance across different skin tone groups
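
The confidence-based review rule can be sketched in a few lines. This is a simplified illustration that uses the maximum softmax probability as the confidence score. The threshold of 0.7 and the `triage` helper are hypothetical, and softmax scores are often poorly calibrated, so in practice the threshold would need to be chosen on validation data.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over raw model outputs."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def triage(logits: list[float], labels: list[str], threshold: float = 0.7) -> dict:
    """Return the top prediction and flag it for manual review if confidence is low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "label": labels[best],
        "confidence": probs[best],
        "needs_review": probs[best] < threshold,
    }


labels = ["mel", "nv", "bkl"]
confident = triage([4.0, 0.5, 0.1], labels)   # one class clearly dominates
ambiguous = triage([1.0, 0.9, 0.8], labels)   # near-uniform scores
print(confident)
print(ambiguous)
```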

Section 08

Summary and Outlook

This project provides empirical evidence for technical selection in skin lesion classification, and both methods have their applicable scenarios. Future explorations can include:

  • Semi-supervised learning to utilize unlabeled data
  • Multimodal fusion combining clinical metadata
  • Lightweight models for mobile deployment
  • Federated learning to protect patient privacy

We look forward to more open-source projects promoting the integration of medical AI with clinical needs.