# RetinaScan-UNet: A Deep Learning Solution for Retinal Vessel Segmentation Based on U-Net

> A project from the Artificial Intelligence course at the University of Seville, using a custom U-Net architecture and 5-fold cross-validation to achieve high-precision automatic segmentation of retinal vessels on the DRIVE 2004 dataset.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-05T16:40:33.000Z
- Last activity: 2026-06-05T16:50:31.275Z
- Popularity: 163.8
- Keywords: U-Net, retinal vessel segmentation, deep learning, medical imaging, DRIVE dataset, Keras, computer vision, image segmentation
- Page link: https://www.zingnex.cn/en/forum/thread/retinascan-unet-u-net
- Canonical: https://www.zingnex.cn/forum/thread/retinascan-unet-u-net

---

## Introduction: Core Overview of the RetinaScan-UNet Project

RetinaScan-UNet is a project from the Artificial Intelligence Software Engineering course (2025/2026 academic year) at the University of Seville. It is maintained by marcosayalab, was released on June 5, 2026, and is open source at https://github.com/marcosayalab/RetinaScan-UNet. The project combines a custom U-Net architecture with 5-fold cross-validation to segment retinal vessels automatically on the DRIVE 2004 dataset, and it covers the full pipeline from preprocessing through training to post-processing.

## Project Background and Significance

Retinal vessel segmentation is a fundamental task in computer-aided diagnosis and eye care. Accurate extraction of vessel structures gives ophthalmologists quantitative biomarkers for monitoring and diagnosing diseases such as diabetic retinopathy (microaneurysms, neovascularization), hypertensive retinopathy (arteriolar narrowing), glaucoma (vascular changes related to optic nerve damage), and age-related macular degeneration (assessment of vascular integrity). Because medical image datasets are small and annotation is expensive, the project combines data augmentation, patch-based training, a skip-connection architecture, cross-validation, and post-inference processing to extract vessels accurately at a manageable computational cost.

## Dataset Introduction: DRIVE 2004 Benchmark Dataset

The project uses the DRIVE dataset, released in 2004 and widely used as a benchmark for retinal vessel segmentation. It contains 40 color fundus images of 565×584 pixels (width × height), split into 20 training images and 20 test images. Each image comes with the original RGB fundus photograph, a Field of View (FoV) mask, and a manual vessel segmentation; the test images were annotated by two independent observers, while the training images carry a single annotation. The second annotation makes it possible to measure inter-observer variability, which gives a realistic human-level reference for algorithm performance.

## Technical Architecture: Detailed Explanation of Custom U-Net

The project implements a custom U-Net architecture (encoder-decoder structure) based on the Keras Functional API:

**Encoder Module**: Each encoder block contains two 3×3 Conv2D layers (ReLU activation, He initialization) followed by a 2×2 MaxPooling2D layer. The blocks extract semantic features, enlarge the receptive field, and reduce the spatial resolution.

**Decoder Module**: Each decoder block upsamples with UpSampling2D or Conv2DTranspose, concatenates the matching encoder output through a skip connection (Concatenate), and refines the result with additional convolution layers. The blocks restore spatial resolution and improve localization accuracy.

**Advantages of Skip Connections**: Skip connections pass high-resolution encoder features directly to the decoder, which preserves fine vessel detail, keeps boundaries sharp, and helps reconstruct capillary structures.
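The encoder, decoder, and skip connections described above can be sketched with the Keras Functional API. This is a minimal illustration, not the project's actual model: the function names `conv_block` and `build_unet`, the input shape, the filter counts, and the depth are all assumptions chosen to keep the example small.

```python
import keras
from keras import layers


def conv_block(x, filters):
    """Two 3x3 convolutions with ReLU activation and He initialization."""
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu",
                          kernel_initializer="he_normal")(x)
    return x


def build_unet(input_shape=(128, 128, 3), base_filters=16, depth=3):
    """Small U-Net; base_filters and depth are illustrative values only."""
    inputs = keras.Input(shape=input_shape)
    x, skips = inputs, []

    # Encoder: conv block, remember the feature map, then halve the resolution.
    for level in range(depth):
        x = conv_block(x, base_filters * 2 ** level)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)

    # Bottleneck at the coarsest resolution.
    x = conv_block(x, base_filters * 2 ** depth)

    # Decoder: upsample, concatenate the matching encoder output, refine.
    for level in reversed(range(depth)):
        filters = base_filters * 2 ** level
        x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skips[level]])
        x = conv_block(x, filters)

    # One sigmoid channel: per-pixel vessel probability.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return keras.Model(inputs, outputs, name="unet_sketch")
```

With `padding="same"` and an input side divisible by `2 ** depth`, the output has the same height and width as the input, so each pixel of the patch receives one probability.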

## Training Strategy and Optimization Measures

### Patch-Based Training
Full-resolution images are divided into 128×128 or 256×256 patches. Images are padded dynamically so that their sides are compatible with the patch size, and the padding is removed again when the prediction is reconstructed. Patch-based training yields more training samples, lowers memory requirements, allows larger batches, and improves training stability.
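A minimal NumPy sketch of the pad, split, and reconstruct cycle is shown below. It assumes non-overlapping tiles and zero padding on the bottom and right edges; the project may use overlapping patches or a different padding mode, and the helper names are hypothetical.

```python
import numpy as np


def pad_to_multiple(image, patch):
    """Zero-pad the bottom and right edges so both sides divide by `patch`."""
    h, w = image.shape[:2]
    pad_h, pad_w = (-h) % patch, (-w) % patch
    pad = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pad, mode="constant")


def extract_patches(image, patch=128):
    """Split an image into non-overlapping patch x patch tiles (row-major)."""
    padded = pad_to_multiple(image, patch)
    rows, cols = padded.shape[0] // patch, padded.shape[1] // patch
    tiles = [padded[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch]
             for r in range(rows) for c in range(cols)]
    return np.stack(tiles), (rows, cols)


def reconstruct(tiles, grid, original_shape):
    """Stitch tiles back together and crop away the padding."""
    rows, cols = grid
    patch = tiles.shape[1]
    full = np.zeros((rows * patch, cols * patch) + tiles.shape[3:],
                    dtype=tiles.dtype)
    for i, tile in enumerate(tiles):
        r, c = divmod(i, cols)
        full[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = tile
    h, w = original_shape[:2]
    return full[:h, :w]
```

For a DRIVE-sized array of shape (584, 565, 3) and 128×128 patches, both sides are padded to 640, which gives a 5×5 grid of 25 tiles per image.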

### Data Augmentation
Horizontal flips, vertical flips, random rotations, and contrast perturbations are applied on the fly during training. This increases data diversity without storing augmented copies, so memory usage stays low.
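The sketch below shows one way to apply such augmentations to an image and its vessel mask together. It makes several assumptions that are not taken from the project: patches are square, images are floats in [0, 1], rotations are restricted to multiples of 90 degrees, and the contrast factor is drawn from [0.8, 1.2]. The function name `augment` is hypothetical.

```python
import numpy as np


def augment(image, mask, rng):
    """Apply the same random geometric transform to image and mask."""
    if rng.random() < 0.5:                      # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                      # vertical flip
        image, mask = image[::-1, :], mask[::-1, :]
    k = int(rng.integers(0, 4))                 # rotation by k * 90 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)

    # Contrast perturbation touches the image only, never the labels.
    factor = rng.uniform(0.8, 1.2)
    mean = image.mean()
    image = np.clip((image - mean) * factor + mean, 0.0, 1.0)
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)
```

The key design point is that every geometric transform is applied identically to the image and the mask, while intensity changes are applied to the image alone; otherwise the labels would no longer line up with the vessels.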

### 5-Fold Cross-Validation
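The training data is split into five folds; each fold serves once as the validation set while the remaining four are used for training. Rotating the training and validation sets in this way reduces sampling bias and improves model robustness.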
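The fold rotation can be sketched in a few lines of NumPy. The example assumes that the split is made at the image level over the 20 training images, so that patches from one image never appear in both the training and validation sets; whether the project splits by image or by patch is not stated. The generator name `k_fold_indices` is hypothetical.

```python
import numpy as np


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx


for fold, (train_idx, val_idx) in enumerate(k_fold_indices(20), start=1):
    print(f"fold {fold}: {len(train_idx)} train images, "
          f"{len(val_idx)} validation images")
```

With 20 images and five folds, every fold trains on 16 images and validates on 4.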

### Training Configuration
- Optimizer: Adam
- Loss function: Binary Cross-Entropy
- Framework: Keras 3 / TensorFlow
- Callback: EarlyStopping
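In Keras, a configuration along these lines could look like the sketch below. The learning rate, the monitored quantity, the patience, and `restore_best_weights` are assumptions, since the post does not give hyperparameters, and the helper name `compile_for_segmentation` is hypothetical.

```python
import keras


def compile_for_segmentation(model, learning_rate=1e-3, patience=10):
    """Compile with Adam and binary cross-entropy; return the callbacks list."""
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss=keras.losses.BinaryCrossentropy(),
    )
    # Stop when the validation loss stops improving and keep the best weights.
    early_stopping = keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=patience, restore_best_weights=True)
    return [early_stopping]
```

The returned list would be passed as `callbacks=` to `model.fit`, together with a validation set so that `val_loss` is available to monitor.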

## Optimization Techniques in Inference Phase

### Test-Time Augmentation (TTA)
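Transformations such as flips and rotations are applied to each input image, the model predicts on every variant, each prediction is mapped back to the original orientation, and the results are averaged. Averaging over these views makes the predictions more stable and improves generalization.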
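A minimal sketch of TTA follows. It assumes square inputs, a `predict_fn` callable that maps an (H, W, C) image to an (H, W) probability map, and a fixed set of six views; the views the project actually uses are not specified.

```python
import numpy as np


def predict_with_tta(predict_fn, image):
    """Average predictions over flips and 90-degree rotations.

    Each entry pairs a forward transform with the inverse that maps the
    prediction back to the original orientation.
    """
    transforms = [
        (lambda a: a, lambda a: a),                              # identity
        (lambda a: a[:, ::-1], lambda a: a[:, ::-1]),            # horizontal flip
        (lambda a: a[::-1, :], lambda a: a[::-1, :]),            # vertical flip
        (lambda a: np.rot90(a, 1), lambda a: np.rot90(a, -1)),   # 90 degrees
        (lambda a: np.rot90(a, 2), lambda a: np.rot90(a, -2)),   # 180 degrees
        (lambda a: np.rot90(a, 3), lambda a: np.rot90(a, -3)),   # 270 degrees
    ]
    probs = [inverse(predict_fn(forward(image)))
             for forward, inverse in transforms]
    return np.mean(probs, axis=0)
```

Undoing the transform before averaging is essential: without the inverse step, the averaged map would mix predictions that refer to different pixel locations.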

### Adaptive Otsu Thresholding
Otsu's method binarizes the model's output probability map. Instead of a fixed cutoff, it determines the threshold automatically for each image, which adapts the binarization to different brightness distributions.
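Otsu's method picks the threshold that maximizes the between-class variance of the histogram. The NumPy sketch below applies it to a probability map with values in [0, 1]; the bin count and the function name `otsu_threshold` are assumptions, and a library routine may be what the project actually calls.

```python
import numpy as np


def otsu_threshold(prob_map, bins=256):
    """Return the threshold in [0, 1] that maximizes between-class variance."""
    hist, edges = np.histogram(prob_map, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2
    weights = hist / hist.sum()

    w0 = np.cumsum(weights)                  # weight of bins up to and including i
    w1 = 1.0 - w0                            # weight of the bins above i
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]

    # Between-class variance; bins where one class is empty score zero.
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    variance = np.zeros(bins)
    variance[valid] = ((total_mean * w0[valid] - cum_mean[valid]) ** 2
                       / (w0[valid] * w1[valid]))
    return edges[np.argmax(variance) + 1]    # upper edge of the best bin


# Usage: pixels strictly above the threshold are labeled as vessel.
prob_map = np.full((20, 20), 0.1)
prob_map[:6] = 0.9
vessel_mask = prob_map > otsu_threshold(prob_map)
```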

### Morphological Post-Processing
Opening and closing operations are applied to the binary mask: opening removes isolated noise pixels, and closing fills small holes. Together they smooth vessel boundaries and improve segmentation quality.
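The two operations can be sketched in plain NumPy with a 3×3 square structuring element. The structuring element, the order of the two steps (closing first, then opening), and the function names are assumptions for illustration; the post does not specify them.

```python
import numpy as np


def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def erode(mask):
    """Binary erosion via duality: complement of the dilated background.

    Pixels outside the image count as foreground, so the image border
    itself does not erode objects that touch it.
    """
    return ~dilate(~mask)


def closing(mask):
    """Dilation followed by erosion: fills small holes and gaps."""
    return erode(dilate(mask))


def opening(mask):
    """Erosion followed by dilation: removes small isolated specks."""
    return dilate(erode(mask))


def postprocess(mask):
    """Fill holes first, then remove specks (one possible ordering)."""
    mask = np.asarray(mask, dtype=bool)
    return opening(closing(mask))
```

A 3×3 opening also removes structures narrower than three pixels, so on thin capillaries a smaller structuring element or a size-based speck filter may be the safer choice.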

## Project Features and Innovations

1. **End-to-End Pipeline**: Covers data preprocessing, model training, and inference-time optimization in a single workflow.
2. **Medical Image-Specific Optimization**: Patch-based training, TTA, and morphological post-processing address the particular difficulties of retinal vessel segmentation, such as fine vessel structures and scarce annotations.
3. **Computational Resource-Friendly**: Designed to run efficiently on standard consumer-grade hardware.
4. **Academic and Practical Value**: As a course project, it pairs a demonstration of the underlying theory with a usable implementation.

## Summary and Insights

RetinaScan-UNet demonstrates the application of the classic U-Net in medical image segmentation. Through careful training strategies and inference optimization, it achieves high-quality vessel segmentation with limited annotated data.

Insights for the medical image AI field:
- Data augmentation and cross-validation are key strategies for small dataset scenarios;
- Skip connections are crucial for preserving detailed information;
- TTA and morphological post-processing can further improve segmentation quality at inference time;
- A reasonable patch strategy balances computational efficiency and model performance.

This open-source project is a useful reference for researchers and developers who apply deep learning to medical image analysis.