Zing Forum

Deepfake Image Detection System Based on Wavelet Transform and Deep Learning: A Study on Cross-Generator Generalization Ability

This article introduces a hybrid architecture combining an RGB convolutional neural network and a wavelet transform branch for detecting AI-generated images and Deepfake content. Trained only on Stable Diffusion, the system achieves an accuracy of 95.4% on unseen GAN-generated images, demonstrating excellent cross-generator generalization ability.

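Tags: Deepfake detection, wavelet transform, convolutional neural network, AI-generated image recognition, image forensics, Stable Diffusion, GAN detection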
Published 2026-06-03 23:45 | Recent activity 2026-06-03 23:51 | Estimated read 8 min

Section 01

Introduction: Deepfake Detection System Based on Wavelet Transform and Deep Learning (Study on Cross-Generator Generalization Ability)

The DeepTrace project introduced in this article proposes a hybrid architecture that combines an RGB convolutional neural network with a wavelet transform branch to detect AI-generated images and Deepfake content. Trained only on Stable Diffusion images, the system reaches 95.4% accuracy on unseen GAN-generated images, which indicates strong cross-generator generalization. The project is open-source and ships with a desktop tool, and it targets two limitations of existing detection methods: overfitting to specific generators and poor interpretability.

Section 02

Research Background and Motivation: Limitations of Existing Methods and Insights from Frequency Domain Analysis

Limitations of Existing Methods

Existing Deepfake detectors face two major challenges: 1. They overfit to specific generators, so accuracy drops sharply when the generator architecture changes; 2. They behave as black boxes, which makes their decisions difficult to interpret.

Frequency Domain Analysis Experiments

Classification performance of different feature combinations on GAN images:

Feature Combination                   Test AUC    Test F1 Score
HSV color space only                  0.580       0.588
FFT frequency domain features only    0.637       0.607
Wavelet transform features only       0.658       0.627
FFT + Wavelet transform               0.719       0.673
HSV + FFT + Wavelet transform         0.726       0.673

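Key insight: Frequency-domain features carry more discriminative signal than color features. FFT and wavelet features each outperform HSV alone, their combination reaches an AUC of 0.719, and adding HSV on top raises the AUC only to 0.726 with no change in F1. This suggests that AI-generated artifacts show characteristic patterns in the frequency domain that distinguish generated images from real photos.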
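
To make the idea of a frequency-domain feature concrete, here is a minimal sketch of one possible FFT-based descriptor: the share of spectral energy above a cutoff frequency. The article does not say which FFT features were used, so the function name high_freq_energy_ratio, the cutoff of 0.25 cycles per pixel, and the naive DFT (chosen only to stay dependency-free) are all illustrative assumptions.

```python
import cmath
import math


def dft_1d(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def dft_2d(image):
    """2-D DFT computed separably: transform every row, then every column."""
    rows = [dft_1d(row) for row in image]            # rows[y][kx]
    cols = [dft_1d(list(col)) for col in zip(*rows)]  # cols[kx][ky]
    return [list(r) for r in zip(*cols)]              # result[ky][kx]


def high_freq_energy_ratio(image, cutoff=0.25):
    """Share of non-DC spectral energy above a radial frequency cutoff.

    `cutoff` is in cycles per pixel (Nyquist is 0.5). It is an illustrative
    value, not one taken from the article.
    """
    height, width = len(image), len(image[0])
    spectrum = dft_2d(image)
    high = total = 0.0
    for ky in range(height):
        for kx in range(width):
            if ky == 0 and kx == 0:
                continue  # skip the DC term (mean brightness)
            # Fold indices so that k and n - k map to the same frequency.
            fy = min(ky, height - ky) / height
            fx = min(kx, width - kx) / width
            energy = abs(spectrum[ky][kx]) ** 2
            total += energy
            if math.hypot(fx, fy) > cutoff:
                high += energy
    # Treat a (numerically) flat image as having no high-frequency content.
    return high / total if total > 1e-12 else 0.0


# A slowly varying pattern versus a pixel-level checkerboard.
smooth = [[math.cos(2 * math.pi * x / 8) for x in range(8)] for _ in range(8)]
checker = [[(-1) ** (x + y) for x in range(8)] for y in range(8)]
print(high_freq_energy_ratio(smooth), high_freq_energy_ratio(checker))
```

The smooth pattern keeps almost all of its energy below the cutoff, while the checkerboard concentrates it at the Nyquist frequency. A real pipeline would use a fast FFT routine and feed several such statistics to a classifier.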

Section 03

Method Architecture: WaveletHybridNet Dual-Branch Design and Training Configuration

Dual-Branch Architecture

  • RGB Branch: Extracts visual features such as color and texture, capturing semantic information in pixel space;
  • Wavelet Branch: Uses Daubechies db4 wavelet with two-level decomposition to extract high-frequency subbands (LH, HL, HH) and capture AI-generated artifacts.
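
The subband extraction performed by the wavelet branch can be illustrated with a small sketch. To stay dependency-free it swaps in the Haar wavelet for db4, so it shows the structure of a two-level decomposition rather than the project's actual filters. With PyWavelets, which the project lists in its stack, the equivalent call would presumably be pywt.wavedec2(image, "db4", level=2). The function names and the LH/HL naming convention below are assumptions.

```python
def haar_dwt2(image):
    """One level of the 2-D orthonormal Haar transform.

    Returns (LL, LH, HL, HH), each half the size of `image`, whose height and
    width must be even. Naming convention used here (conventions differ):
    LH = change between rows, HL = change between columns, HH = diagonal.
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("image height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for y in range(0, height, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for x in range(0, width, 2):
            a, b = image[y][x], image[y][x + 1]
            c, d = image[y + 1][x], image[y + 1][x + 1]
            ll_row.append((a + b + c + d) / 2)  # local average
            lh_row.append((a + b - c - d) / 2)  # top row minus bottom row
            hl_row.append((a - b + c - d) / 2)  # left column minus right column
            hh_row.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def high_freq_subbands(image, levels=2):
    """Collect (LH, HL, HH) for each level, finest level first."""
    details = []
    approx = image
    for _ in range(levels):
        # Each further level decomposes the previous low-pass (LL) band.
        approx, lh, hl, hh = haar_dwt2(approx)
        details.append((lh, hl, hh))
    return details


# 128x128 input, matching the input size in the training configuration.
stripes = [[x % 2 for x in range(128)] for _ in range(128)]
for level, (lh, hl, hh) in enumerate(high_freq_subbands(stripes), start=1):
    print(level, len(lh), len(lh[0]))
```

With Haar, each level halves the resolution, so a 128×128 input yields 64×64 and then 32×32 detail subbands. With db4 the subbands are slightly larger because the longer filter requires boundary padding.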

Training Configuration

Parameter                  Value
Optimizer                  AdamW
Learning rate              1×10⁻⁴
Weight decay               1×10⁻⁴
Batch size                 8
Input size                 128×128
Wavelet type               db4 (2-level decomposition)
Validation strategy        2-fold cross-validation
Early stopping patience    3 epochs
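
As a rough illustration, the configuration above could be captured in code as follows. The TrainConfig and EarlyStopping names are hypothetical, and monitoring validation loss is an assumption, since the article does not state which metric drives early stopping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from the training configuration table."""
    optimizer: str = "AdamW"
    learning_rate: float = 1e-4
    weight_decay: float = 1e-4
    batch_size: int = 8
    input_size: tuple = (128, 128)
    wavelet: str = "db4"
    wavelet_levels: int = 2
    cv_folds: int = 2
    early_stopping_patience: int = 3


class EarlyStopping:
    """Stop once the monitored loss has not improved for `patience` epochs in a row."""

    def __init__(self, patience):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


cfg = TrainConfig()
stopper = EarlyStopping(cfg.early_stopping_patience)
# Made-up validation losses, purely to exercise the stopping rule.
for epoch, loss in enumerate([0.70, 0.55, 0.56, 0.57, 0.58, 0.40]):
    if stopper.step(loss):
        print("stopping after epoch", epoch)
        break
```

In PyTorch, the optimizer row would correspond to torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate, weight_decay=cfg.weight_decay).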

Section 04

Core Experimental Results: Verification of Cross-Generator Generalization Ability

Experimental Design

The model was trained only on Stable Diffusion v1.5 images and tested on GAN-generated images it had never seen.

Test Results

Metric                         Value
Overall accuracy               95.4%
Real image recognition rate    94% (2360/2507)
Fake image recognition rate    97% (2411/2493)
False positives                147 images
False negatives                82 images
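
The reported figures can be cross-checked from the raw counts. The sketch below treats "fake" as the positive class, which matches the article's use of "false positives" for real images flagged as fake. The function name and dictionary keys are illustrative.

```python
def detection_metrics(real_correct, real_total, fake_correct, fake_total):
    """Derive summary metrics from per-class counts ("fake" = positive class)."""
    tp = fake_correct                  # fakes correctly flagged
    fn = fake_total - fake_correct     # fakes labelled real
    tn = real_correct                  # real images correctly passed
    fp = real_total - real_correct     # real images labelled fake
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (real_total + fake_total),
        "real_recognition_rate": tn / real_total,
        "fake_recognition_rate": recall,
        "false_positives": fp,
        "false_negatives": fn,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = detection_metrics(real_correct=2360, real_total=2507,
                            fake_correct=2411, fake_total=2493)
for name, value in metrics.items():
    print(name, value)
```

The counts reproduce the table: 4771 of 5000 images are classified correctly (95.4%), with 147 false positives and 82 false negatives. Precision (about 0.943) and F1 (about 0.955) for the fake class are derived here from those counts and are not reported in the article.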

Result Interpretation

The model detects GAN-generated images with high accuracy without having seen any during training. This suggests that: 1. The frequency domain artifacts captured by the wavelet branch transfer across generators; 2. Different generators leave similar fingerprints in the frequency domain; 3. The hybrid architecture helps separate general frequency domain traces from generator-specific features.

Section 05

Application Deployment: Implementation and Usage of DeepTrace Desktop Tool

System Architecture

Browser/Desktop App → Spring Boot Backend → Python Subprocess Inference → Return Results
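
The article does not describe how the backend and the Python subprocess exchange data. One plausible arrangement, sketched below under that caveat, is for the Python script to receive an image path as its only argument and print a single JSON line for the parent process to parse. The JSON field names, the 0.5 threshold, the script name infer.py, and the predict callable are all hypothetical.

```python
import json


def format_result(prob_fake, threshold=0.5):
    """Turn a fake-probability into a REAL/FAKE verdict plus confidence."""
    is_fake = prob_fake >= threshold
    # Confidence is the probability of whichever label was chosen.
    confidence = prob_fake if is_fake else 1.0 - prob_fake
    return json.dumps({
        "label": "FAKE" if is_fake else "REAL",
        "confidence": round(confidence, 4),
    })


def main(argv, predict):
    """Entry point: `argv` is [script, image_path]; `predict` maps a path to P(fake).

    Prints one JSON line on stdout for the parent process and returns an
    exit code (0 = success, 1 = inference error, 2 = bad arguments).
    """
    if len(argv) != 2:
        print(json.dumps({"error": "usage: infer.py <image_path>"}))
        return 2
    try:
        print(format_result(predict(argv[1])))
        return 0
    except Exception as exc:  # report failures as JSON instead of a traceback
        print(json.dumps({"error": str(exc)}))
        return 1


# Stand-in predictor; a real one would load the trained model and run it.
exit_code = main(["infer.py", "face.png"], predict=lambda path: 0.93)

# In a real script the last line would be something like:
#     sys.exit(main(sys.argv, predict=load_model_and_predict))
```

Keeping stdout to one JSON line and signalling failure through the exit code lets the calling process distinguish a verdict from an error without parsing tracebacks.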

Technology Stack

Layer                   Technology
Machine Learning        PyTorch, PyWavelets (db4)
Traditional Baseline    scikit-learn SVM
Web Backend             Spring Boot 3.2, Java 17
Desktop App             Java Swing, System Tray API

Usage Steps

  1. Run DeepTrace.exe, the D icon appears in the system tray;
  2. Double-click to take a screenshot, drag the selection box to enclose the face;
  3. Release the mouse to get REAL/FAKE result and confidence level;
  4. Press ESC or right-click to cancel.

Section 06

Limitations and Future Directions: Current Restrictions and Expansion Plans

Current Limitations

  • Portrait/Avatar Optimization: The model is tuned for portraits and avatars, so reliability is low on scenes and other non-facial content;
  • Single Training Data Source: The model was trained only on Stable Diffusion v1.5 images.

Future Plans

Expand the training dataset to include generators such as DALL·E, Wukong, and additional GAN families to improve cross-generator accuracy and robustness.

Section 07

Conclusion: Project Significance and Open-Source Contribution

The DeepTrace project shows that a hybrid RGB and wavelet architecture trained on a single generator can reach 95.4% accuracy on images from unseen GAN generators. This cross-generator generalization has practical value for building detection systems that keep pace with a changing generative AI ecosystem. The open-source implementation and desktop tool make the approach accessible to researchers and everyday users who need to identify AI-generated content.
