# Deepfake Image Detection System Based on Wavelet Transform and Deep Learning: A Study on Cross-Generator Generalization Ability

> This article introduces a hybrid architecture combining an RGB convolutional neural network and a wavelet transform branch for detecting AI-generated images and Deepfake content. Trained only on Stable Diffusion, the system achieves an accuracy of 95.4% on unseen GAN-generated images, demonstrating excellent cross-generator generalization ability.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-03T15:45:35.000Z
- Last activity: 2026-06-03T15:51:42.676Z
- Popularity: 148.9
- Keywords: Deepfake detection, wavelet transform, convolutional neural network, AI-generated image recognition, image forensics, Stable Diffusion, GAN detection
- Page link: https://www.zingnex.cn/en/forum/thread/deepfake-4e91b85b
- Canonical: https://www.zingnex.cn/forum/thread/deepfake-4e91b85b
- Markdown source: floors_fallback

---

## Introduction: Deepfake Detection System Based on Wavelet Transform and Deep Learning (Study on Cross-Generator Generalization Ability)

DeepTrace is an open-source project that detects AI-generated images and Deepfake content with a hybrid architecture: an RGB convolutional neural network combined with a wavelet transform branch. Trained only on Stable Diffusion images, the system reaches 95.4% accuracy on GAN-generated images it never saw during training. The project also provides a desktop tool and aims to address the limitations of traditional image forensics methods.

## Research Background and Motivation: Limitations of Existing Methods and Insights from Frequency Domain Analysis

### Limitations of Existing Methods
Traditional Deepfake detection faces two major challenges:

- **Overfitting to specific generators**: accuracy drops sharply when the generator architecture changes;
- **Black-box behavior**: the decision process is difficult to interpret.

### Frequency Domain Analysis Experiments
Different handcrafted feature combinations were tested for how well they classify GAN images:

| Feature Combination | Test AUC | Test F1 Score |
|---------|--------|-----------|
| HSV color space only | 0.580 | 0.588 |
| FFT frequency domain features only | 0.637 | 0.607 |
| Wavelet transform features only | 0.658 | 0.627 |
| FFT + Wavelet transform | 0.719 | 0.673 |
| HSV + FFT + Wavelet transform | 0.726 | 0.673 |

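Key insight: frequency-domain features carry most of the discriminative signal. FFT and wavelet features each outperform HSV color alone, combining them raises the AUC to 0.719, and adding HSV on top yields only a marginal gain (0.726). This suggests that AI-generated artifacts show distinctive patterns in the frequency domain.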
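To make "FFT frequency domain features" concrete, here is a minimal sketch of one common choice: the azimuthally averaged log-power spectrum. The article does not say which FFT features were actually used, so the function name `radial_power_spectrum`, the ring count, and the feature definition itself are assumptions made for illustration.

```python
import numpy as np


def radial_power_spectrum(gray: np.ndarray, n_bins: int = 16) -> np.ndarray:
    """Average the log-power spectrum of a 2-D image over concentric rings.

    Hypothetical feature extractor: low indices describe low frequencies,
    high indices describe high frequencies.
    """
    # Shift the zero frequency to the centre so that the distance from the
    # centre corresponds to spatial frequency.
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    log_power = np.log1p(np.abs(spectrum) ** 2)

    h, w = gray.shape
    yy, xx = np.indices((h, w))
    radius = np.hypot(yy - h // 2, xx - w // 2)

    # Assign every frequency coefficient to one of n_bins rings.
    edges = np.linspace(0.0, radius.max() + 1e-9, n_bins + 1)
    ring = np.clip(np.digitize(radius.ravel(), edges) - 1, 0, n_bins - 1)

    sums = np.bincount(ring, weights=log_power.ravel(), minlength=n_bins)
    counts = np.bincount(ring, minlength=n_bins)
    return sums / np.maximum(counts, 1)


# Usage on a random 128x128 array standing in for a grayscale image.
features = radial_power_spectrum(np.random.default_rng(0).random((128, 128)))
print(features.shape)
```

A fixed-length vector like this can be fed to a classical classifier such as an SVM, which is how baseline comparisons of this kind are typically run.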

## Method Architecture: WaveletHybridNet Dual-Branch Design and Training Configuration

### Dual-Branch Architecture
- **RGB Branch**: Extracts visual features such as color and texture, capturing semantic information in pixel space;
- **Wavelet Branch**: Uses Daubechies db4 wavelet with two-level decomposition to extract high-frequency subbands (LH, HL, HH) and capture AI-generated artifacts.
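
The wavelet decomposition described above can be sketched with PyWavelets, the library named in the project's technology stack. This is an illustration under assumptions rather than the project's actual preprocessing: the function name `highfreq_subbands` is hypothetical, and the way PyWavelets' horizontal, vertical, and diagonal detail coefficients map to the LH, HL, and HH labels depends on the naming convention in use.

```python
import numpy as np
import pywt  # PyWavelets


def highfreq_subbands(gray: np.ndarray, wavelet: str = "db4", level: int = 2):
    """Return the detail subbands of each level, coarsest level first.

    Each entry is an array of shape (3, H, W) that stacks PyWavelets'
    horizontal, vertical and diagonal detail coefficients.
    """
    coeffs = pywt.wavedec2(gray, wavelet=wavelet, level=level)
    # coeffs[0] is the low-frequency approximation; it is dropped here
    # because only the high-frequency subbands are of interest.
    return [np.stack(detail, axis=0) for detail in coeffs[1:]]


# Usage on a random 128x128 array standing in for a grayscale image.
image = np.random.default_rng(0).random((128, 128))
for lvl, bands in zip((2, 1), highfreq_subbands(image)):
    print(f"level {lvl}: {bands.shape}")
```

Because the two levels have different spatial sizes, a network consuming them would need to resize or process them separately before fusing with the RGB branch; how DeepTrace does this is not stated in the article.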

### Training Configuration
| Parameter | Value |
|-----|---|
| Optimizer | AdamW |
| Learning rate | 1×10⁻⁴ |
| Weight decay | 1×10⁻⁴ |
| Batch size | 8 |
| Input size | 128×128 |
| Wavelet type | db4 (2-level decomposition) |
| Validation strategy | 2-fold cross-validation |
| Early stopping patience | 3 epochs |
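
The table above can be captured as a small configuration object, together with a sketch of how a patience of 3 epochs behaves. The class names are hypothetical, and monitoring validation loss is an assumption: the article does not say which metric drives early stopping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the training configuration table."""
    optimizer: str = "AdamW"
    learning_rate: float = 1e-4
    weight_decay: float = 1e-4
    batch_size: int = 8
    input_size: int = 128
    wavelet: str = "db4"
    wavelet_levels: int = 2
    n_folds: int = 2
    patience: int = 3


class EarlyStopping:
    """Stop once the monitored loss has not improved for `patience` epochs."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience: int) -> int:
    """Count how many epochs execute before early stopping triggers."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(val_losses)


cfg = TrainConfig()
# Made-up losses: the best value is at epoch 2, followed by three epochs
# without improvement, so training stops after epoch 5.
print(epochs_run([0.9, 0.7, 0.71, 0.72, 0.73, 0.5], cfg.patience))
```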

## Core Experimental Results: Verification of Cross-Generator Generalization Ability

### Experimental Design
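The model was trained only on Stable Diffusion v1.5 images and then tested on GAN-generated images it had never seen. According to the counts below, the test set contains 5,000 images: 2,507 real and 2,493 GAN-generated.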

### Test Results
| Metric | Value |
|-----|------|
| Overall accuracy | 95.4% |
| Real image recognition rate | 94.1% (2360/2507) |
| Fake image recognition rate | 96.7% (2411/2493) |
| False positives (real images flagged as fake) | 147 images |
| False negatives (fake images classified as real) | 82 images |
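
The headline figures can be rederived from the raw counts in the table. The sketch below treats "fake" as the positive class; the function name `summarize` is hypothetical, and the precision value is an extra derived quantity that the article itself does not report.

```python
def summarize(tn: int, fp: int, tp: int, fn: int) -> dict:
    """Derive summary metrics from confusion-matrix counts.

    tn: real images classified as real    fp: real images flagged as fake
    tp: fake images classified as fake    fn: fake images classified as real
    """
    total = tn + fp + tp + fn
    return {
        "accuracy": (tp + tn) / total,
        "real_recognition_rate": tn / (tn + fp),
        "fake_recognition_rate": tp / (tp + fn),
        "fake_precision": tp / (tp + fp),
    }


# Counts taken from the results table.
metrics = summarize(tn=2360, fp=147, tp=2411, fn=82)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The overall accuracy works out to 4771/5000, which matches the reported 95.4%.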

### Result Interpretation
The model detects GAN-generated images with high accuracy despite never having seen them during training. This result suggests that: 1. The frequency domain artifacts captured by the wavelet branch transfer across generators; 2. Different generators leave similar fingerprints in the frequency domain; 3. The hybrid architecture separates general frequency domain traces from generator-specific features.

## Application Deployment: Implementation and Usage of DeepTrace Desktop Tool

### System Architecture
Browser/Desktop App → Spring Boot Backend → Python Subprocess Inference → Return Results
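
For a pipeline like this, the Python subprocess needs an agreed output format that the Spring Boot backend can parse. The article does not document that format, so the following is only one plausible sketch: the function name `format_result`, the JSON field names, and the 0.5 threshold are all assumptions.

```python
import json


def format_result(prob_fake: float, threshold: float = 0.5) -> str:
    """Turn a model's fake-probability into a one-line JSON verdict.

    Hypothetical contract: the backend reads this line from the
    subprocess's standard output.
    """
    if not 0.0 <= prob_fake <= 1.0:
        raise ValueError("prob_fake must lie in [0, 1]")
    is_fake = prob_fake >= threshold
    # Confidence is the probability of whichever label was chosen.
    confidence = prob_fake if is_fake else 1.0 - prob_fake
    return json.dumps({
        "label": "FAKE" if is_fake else "REAL",
        "confidence": round(confidence, 4),
    })


# Usage: a real inference script would print this after running the model.
print(format_result(0.9))
```

Printing a single JSON line keeps the Java side simple, since it can read standard output and deserialize it without parsing free-form logs.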

### Technology Stack
| Layer | Technology |
|-----|------|
| Machine Learning | PyTorch, PyWavelets (db4) |
| Traditional Baseline | scikit-learn SVM |
| Web Backend | Spring Boot 3.2, Java 17 |
| Desktop App | Java Swing, System Tray API |

### Usage Steps
1. Run DeepTrace.exe; a "D" icon appears in the system tray;
2. Double-click the icon to start a screenshot, then drag the selection box around the face;
3. Release the mouse button to get a REAL/FAKE result with its confidence level;
4. Press ESC or right-click to cancel.

## Limitations and Future Directions: Current Restrictions and Expansion Plans

### Current Limitations
- Optimized for portraits and avatars: reliability is low on scenes and other non-facial content;
- Single Training Data Source: Trained only on Stable Diffusion v1.5.

### Future Plans
Expand the training dataset to include generators such as DALL·E, Wukong, and additional GAN families to improve cross-generator accuracy and robustness.

## Conclusion: Project Significance and Open-Source Contribution

The DeepTrace project shows that a hybrid RGB and wavelet architecture can combine high accuracy with cross-generator generalization, at least from Stable Diffusion v1.5 to GAN-generated images. This has practical value for building detection systems that keep pace with the generative AI ecosystem. The open-source implementation and desktop tool make the approach accessible to both researchers and everyday users who need to identify AI-generated content.

Resource Links:
- GitHub Repository: https://github.com/firiusz123/Deepfake-detection
- Pre-trained Model: Download best_fold_0.pt from the Releases page
