Zing Forum

A Systematic Comparative Study of 12 Classic CNN Architectures on Ishihara Color Blindness Test Chart Recognition Task

This project systematically compares the performance of 12 classic CNN architectures from LeNet-5 to DenseNet on the Ishihara color blindness test chart recognition task, including a complete framework for data generation, model training, GUI visualization, and comparative analysis.

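Tags: convolutional neural network, CNN, color blindness detection, Ishihara test, deep learning, image classification, model comparison, data augmentation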
Published 2026-06-06 13:13 | Recent activity 2026-06-06 13:24 | Estimated read 6 min

Section 01

[Main Floor/Introduction] A Systematic Comparative Study of 12 Classic CNN Architectures on Ishihara Color Blindness Test Chart Recognition Task

This project compares 12 classic CNN architectures, from LeNet-5 to DenseNet, on the Ishihara color blindness test chart recognition task, and provides a complete framework for data generation, model training, GUI visualization, and comparative analysis. Its core questions are how architecture depth relates to performance, whether structural innovations pay off on this task, how well synthetic data can substitute for real data, and how safe data augmentation strategies affect results.

Section 02

Research Background and Problem Definition

Ishihara color blindness test charts are a widely used screening tool for color vision deficiency. Each chart is a field of colored dots with a number embedded in it: people with normal color vision can read the number, while people with a color vision deficiency struggle to distinguish it from the background. From a computer vision perspective, the task poses three main challenges: telling apart subtle differences between similar hues, extracting a spatial pattern from a cluttered background, and handling dots of varying scale. By comparing 12 milestone CNN architectures, the project aims to show which design choices suit this task.

Section 03

Overview of Synthetic Dataset and Classic CNN Architectures

Because real data is limited, the project includes a synthetic dataset generator. It defines each digit's shape with a 7×5 matrix, densely fills the chart with randomly sized dots drawn from separate foreground and background color palettes, applies color jitter and a circular crop, and provides 5 color schemes. The project also implements 12 CNN architectures, including LeNet-5 (1998, the convolution + pooling foundation), AlexNet (2012, ReLU + Dropout + GPU training), the VGG series (stacked small convolution kernels), ResNet (residual connections), Inception (multi-scale convolution), and DenseNet (dense connections).
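
The generator steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the glyph for "7", the two palettes, the dot radius range, the jitter amount, and the names `generate_plate` and `jitter` are all assumptions, and dots are allowed to overlap for simplicity.

```python
import math
import random

# Hypothetical 7x5 glyph for the digit "7": 1 = foreground cell, 0 = background.
DIGIT_7 = [
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# One assumed color scheme; the project's five schemes are not specified here.
FOREGROUND = [(214, 96, 77), (244, 165, 130)]   # reddish tones
BACKGROUND = [(127, 188, 65), (184, 225, 134)]  # greenish tones


def jitter(color, amount, rng):
    """Shift each RGB channel by up to +/- amount, clamped to 0..255."""
    return tuple(max(0, min(255, c + rng.randint(-amount, amount))) for c in color)


def generate_plate(glyph, size=224, n_dots=1500, seed=0):
    """Return a list of (x, y, radius, rgb) dots for one synthetic plate."""
    rng = random.Random(seed)
    rows, cols = len(glyph), len(glyph[0])
    center = size / 2
    plate_radius = size / 2
    # The glyph occupies a centered box; everything outside it is background.
    box_h = size * 0.6
    box_w = box_h * cols / rows
    top, left = center - box_h / 2, center - box_w / 2
    dots = []
    while len(dots) < n_dots:
        x, y = rng.uniform(0, size), rng.uniform(0, size)
        r = rng.uniform(2.0, 6.0)  # random dot size
        # Circular crop: keep only dots that lie fully inside the plate disc.
        if math.hypot(x - center, y - center) + r > plate_radius:
            continue
        # Map the dot center to a glyph cell (floor handles negative offsets).
        row = math.floor((y - top) / box_h * rows)
        col = math.floor((x - left) / box_w * cols)
        is_fg = 0 <= row < rows and 0 <= col < cols and glyph[row][col] == 1
        palette = FOREGROUND if is_fg else BACKGROUND
        dots.append((x, y, r, jitter(rng.choice(palette), 12, rng)))
    return dots
```

The returned dot list can be rasterized with any drawing library. Keeping the geometry separate from rendering makes it easy to swap color schemes or glyphs without touching the layout logic.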

Section 04

Experimental Framework and Graphical User Interface

The project builds a grid search framework that automatically iterates over combinations of model, learning rate, batch size, and optimizer, and evaluates each combination with K-fold cross-validation so that results do not hinge on a single train/validation split. A tkinter-based GUI covers dataset loading, model configuration, training visualization, multi-model comparison, and prediction, lowering the barrier to running experiments.
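
A grid search with K-fold cross-validation of the kind described above might look like the sketch below. The grid values, the function names, and especially `dummy_train_and_score` are assumptions for illustration: the dummy function returns a made-up number in place of real training and says nothing about how the models actually perform.

```python
import itertools
import math
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


def grid_search(grid, n_samples, k, train_and_score):
    """Score every combination in `grid` with k-fold CV; best mean score first."""
    keys = list(grid)
    results = []
    for values in itertools.product(*(grid[key] for key in keys)):
        config = dict(zip(keys, values))
        scores = [
            train_and_score(config, train_idx, val_idx)
            for train_idx, val_idx in k_fold_indices(n_samples, k)
        ]
        results.append((statistics.mean(scores), config))
    results.sort(key=lambda item: item[0], reverse=True)
    return results


def dummy_train_and_score(config, train_idx, val_idx):
    """Placeholder for real training: a made-up score that peaks at lr = 1e-3."""
    return 1.0 / (1.0 + abs(math.log10(config["lr"]) + 3))


# Hypothetical search space; the project's actual values are not given.
GRID = {
    "model": ["LeNet-5", "ResNet"],
    "lr": [1e-2, 1e-3],
    "batch_size": [32, 64],
    "optimizer": ["sgd", "adam"],
}

results = grid_search(GRID, n_samples=100, k=5, train_and_score=dummy_train_and_score)
best_score, best_config = results[0]
```

Replacing `dummy_train_and_score` with a function that trains the configured model on `train_idx` and returns validation accuracy on `val_idx` turns the sketch into a working experiment loop.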

Section 05

Key Challenges and Solutions

The project addresses three core challenges: 1. Data scarcity: the synthetic data generator produces training data at scale. 2. Semantic constraints on data augmentation: the augmentation strategy is restricted to safe operations and prohibits flipping and rotation, which can invalidate the label (a 6 rotated by 180° reads as a 9). 3. Coordination between the GUI and training: training runs in a background thread so the interface does not freeze.
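
The augmentation and threading points above can be illustrated with a small sketch. Everything here is an assumption rather than the project's code: the allow-list `SAFE_OPS`, the two example transforms, and the helper `train_in_background`. Images are plain nested lists of RGB tuples to keep the example dependency-free.

```python
import queue
import random
import threading

# Assumed allow-list: only transforms that keep the digit readable are permitted.
# Flips and rotations are deliberately absent because they can change the label.
SAFE_OPS = {"brightness", "shift"}


def brightness(image, delta):
    """Add delta to every channel of every pixel, clamped to 0..255."""
    return [[tuple(max(0, min(255, c + delta)) for c in px) for px in row] for row in image]


def shift(image, dx, fill=(255, 255, 255)):
    """Translate the image horizontally by dx pixels, padding with `fill`."""
    width = len(image[0])
    out = []
    for row in image:
        if dx >= 0:
            out.append([fill] * dx + row[: width - dx])
        else:
            out.append(row[-dx:] + [fill] * (-dx))
    return out


def augment(image, ops, rng):
    """Apply the requested ops in order, rejecting anything not on the allow-list."""
    for op in ops:
        if op not in SAFE_OPS:
            raise ValueError(f"augmentation {op!r} is not label-preserving for digit plates")
        if op == "brightness":
            image = brightness(image, rng.randint(-20, 20))
        elif op == "shift":
            image = shift(image, rng.randint(-2, 2))
    return image


def train_in_background(n_epochs, progress):
    """Run a placeholder training loop off the GUI thread, reporting via a queue."""
    def worker():
        for epoch in range(1, n_epochs + 1):
            # Real code would train for one epoch here.
            progress.put(("epoch", epoch))
        progress.put(("done", n_epochs))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread

# In a tkinter GUI, the main thread would drain `progress` with get_nowait()
# inside a callback rescheduled via root.after(100, poll), where `root` is the
# application's Tk instance (hypothetical here), so widgets are only ever
# touched from the main thread.
```

Using an allow-list rather than a block-list means that any newly added transform is rejected until someone has confirmed it preserves the label. The queue keeps the worker thread from touching widgets directly.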

Section 06

Project Value and Summary

This project is a useful reference for deep learning learners (who can trace how CNNs evolved), computer vision researchers (who can transfer the synthetic data and augmentation strategies to other tasks), and medical image processing practitioners (who can draw on its approach to extracting targets from complex backgrounds). In summary, the project forms an end-to-end pipeline from data generation to analysis, and design details such as safe augmentation reflect a close understanding of the task, providing a practical foundation for similar studies.