Zing Forum


Multi-Signal AI Receipt Forgery Detection System: An Anti-Fraud Solution Integrating Vision, OCR, and Anomaly Detection

This open-source project builds a multi-signal fusion AI system designed to detect forged and tampered receipts. By combining EfficientNet image classification, U-Net pixel-level segmentation, OpenCV physical artifact detection, OCR logical verification, and Isolation Forest anomaly detection, the system reaches an AUC of 0.81 and 76% accuracy on the test set, compared with an AUC of 0.67 and 53% accuracy for the single EfficientNet-B3 classifier.

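Tags: receipt forgery detection, multi-signal fusion, EfficientNet, U-Net, OCR, anomaly detection, computer vision, document forensics, anti-fraud, deep learning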
Published 2026-04-25 19:01 | Recent activity 2026-04-25 19:21 | Estimated read 7 min

Section 01

Introduction: Multi-Signal AI Receipt Forgery Detection System

In financial audit and expense reimbursement, receipt forgery (especially local micro-tampering) is hard to catch with any single method. The open-source project forgery_detection proposes a multi-signal fusion AI system that combines EfficientNet classification, U-Net segmentation, OpenCV physical artifact detection, OCR logical verification, and anomaly detection. It reaches an AUC of 0.81 and 76% accuracy on the test set, compared with an AUC of 0.67 and 53% accuracy for the single EfficientNet-B3 classifier, which makes it a more robust approach to document forensics and anti-fraud work.


Section 02

Problem Background and Dataset Details

The difficulty with receipt forgery is local micro-tampering (for example, a modified amount or date), which a conventional single-CNN classifier struggles to identify. The project is built on the SROIE 2019 dataset and contains 1,903 receipts (973 real, 930 forged), each with a pixel-level tampering mask annotation. The two classes are nearly balanced (about 1.05:1 real to forged). The data is split into 1,426 training, 286 validation, and 191 test samples.
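As a quick sanity check on the figures above, the short sketch below recomputes the class ratio and the split proportions from the reported counts. The variable names are illustrative and not taken from the project.

```python
# Counts as reported in the post; variable names are illustrative only.
n_real, n_forged = 973, 930
splits = {"train": 1426, "val": 286, "test": 191}

total = n_real + n_forged
# The class counts and the split counts should describe the same 1903 receipts.
assert total == sum(splits.values())

class_ratio = n_real / n_forged
split_fractions = {name: count / total for name, count in splits.items()}

print(f"total={total}, real:forged={class_ratio:.2f}:1")
for name, fraction in split_fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The counts work out to roughly a 75/15/10 train/validation/test split.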


Section 03

Limitations of Single Models and Value of Multi-Signal Fusion

Experiments expose the limits of a single model: the EfficientNet-B3 classifier alone reaches an AUC of only 0.67 and 53% accuracy, while the multi-signal fusion system reaches an AUC of 0.81 and 76% accuracy. That is a gain of roughly 0.14 in AUC (reported as 13.7 percentage points) and 23 percentage points in accuracy. Key finding: a single model is not sensitive enough to local tampering, and combining multiple signals makes detection markedly more robust.
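The two reported metrics measure different things: AUC scores how well forged receipts are ranked above real ones, while accuracy depends on the decision threshold. The sketch below illustrates this on made-up scores (not the project's outputs). The functions `roc_auc` and `accuracy` are hypothetical helpers written for this example.

```python
def roc_auc(labels, scores):
    """Probability that a random forged sample (label 1) scores higher than
    a random real sample (label 0); ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data, invented for illustration: 4 real (0) and 4 forged (1) receipts.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.45, 0.25, 0.35, 0.40, 0.48]

print(roc_auc(labels, scores))         # 0.75
print(accuracy(labels, scores, 0.5))   # 0.5, every score is below the threshold
print(accuracy(labels, scores, 0.33))  # 0.75 with a lower threshold
```

On this toy data the ranking is reasonable, but accuracy at the default threshold sits at chance level, so it helps to read the two numbers together.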


Section 04

Detailed Multi-Signal Detection Architecture

The system integrates five complementary signals:

  1. Global Classification: EfficientNet-B3 binary classifier (real vs. forged) with 320×320 input, using test-time augmentation (TTA) and class weighting to handle class imbalance;
  2. Pixel Segmentation: U-Net with an EfficientNet-B3 encoder outputs pixel-level tampering masks; the loss function combines Focal Loss, Dice Loss, and BCE;
  3. Physical Artifact Detection: OpenCV-based error level analysis (ELA, which exposes compression traces), edge detection, illumination consistency analysis, and blob detection;
  4. OCR Logical Verification: Tesseract extracts the text, which is then checked for amount arithmetic, field completeness, and format compliance;
  5. Anomaly Detection: an Isolation Forest over OCR-derived features, trained only on real receipts so that it can generalize to new types of forgery.
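To make the OCR logical verification signal concrete, here is a minimal sketch of the three kinds of checks the post names (amount arithmetic, field completeness, format compliance). It is an illustration under assumptions, not the project's code: the field names, the receipt layout, the date format, and the function `verify_receipt` are all hypothetical, and the input is assumed to be text already produced by an OCR engine such as Tesseract.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical layout: one "Label: value" pair per line.
FIELD_RE = re.compile(r"^\s*(subtotal|tax|total)\b[^\d]*(\d+(?:\.\d{1,2})?)\s*$",
                      re.IGNORECASE)
DATE_RE = re.compile(r"\b(\d{2}/\d{2}/\d{4})\b")


def verify_receipt(ocr_text, tolerance=Decimal("0.01")):
    """Return (parsed_fields, issues) for OCR'd receipt text."""
    fields, issues = {}, []
    for line in ocr_text.splitlines():
        amount = FIELD_RE.match(line)
        if amount:
            fields[amount.group(1).lower()] = Decimal(amount.group(2))
        date = DATE_RE.search(line)
        if date:
            fields["date"] = date.group(1)

    # Field completeness: every expected field must be present.
    for name in ("subtotal", "tax", "total", "date"):
        if name not in fields:
            issues.append(f"missing field: {name}")

    # Format compliance: the date must be a valid calendar date (DD/MM/YYYY assumed).
    if "date" in fields:
        try:
            datetime.strptime(fields["date"], "%d/%m/%Y")
        except ValueError:
            issues.append("invalid date")

    # Amount arithmetic: subtotal + tax should equal total.
    if all(k in fields for k in ("subtotal", "tax", "total")):
        diff = abs(fields["subtotal"] + fields["tax"] - fields["total"])
        if diff > tolerance:
            issues.append("total does not equal subtotal + tax")
    return fields, issues


genuine = "Date: 25/04/2019\nSubtotal: 10.00\nTax: 0.60\nTotal: 10.60"
tampered = genuine.replace("Total: 10.60", "Total: 18.60")

print(verify_receipt(genuine)[1])   # []
print(verify_receipt(tampered)[1])  # ['total does not equal subtotal + tax']
```

`Decimal` is used instead of floats so that currency sums compare exactly. A tampered total that still looks visually plausible is flagged because the arithmetic no longer holds.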

Section 05

Decision Fusion Engine and Tech Stack

Decision Fusion Strategy:

  • Strong-Signal Override: A single high-confidence signal determines the result directly;
  • Consensus Voting: Weighted voting combines all signals;
  • Integrated Scoring: The signals are mapped to a unified score, and the system outputs one of three levels (clean/suspicious/forged) along with a confidence value, a heatmap, and the parsed fields.

Technical Implementation: The system uses PyTorch (deep learning), OpenCV (image processing), Tesseract (OCR), Scikit-learn (anomaly detection), and FastAPI (API service). Training was done on Google Colab, and Jupyter Notebooks are provided for both the baseline model and the final multi-signal model.
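The three fusion rules listed above can be sketched as a small rule-based function. This is a minimal illustration under assumptions, not the project's fusion engine: the `Signal` class, the weights, and the thresholds (`override`, `suspicious`, `forged`) are all hypothetical values chosen for the example, and the override is assumed to apply only in the "forged" direction.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    name: str
    score: float   # forgery probability in [0, 1]
    weight: float  # hypothetical voting weight


def fuse(signals, override=0.95, suspicious=0.4, forged=0.6):
    """Combine per-signal scores into a three-level decision."""
    if not signals:
        raise ValueError("at least one signal is required")

    # Rule 1: a single high-confidence signal decides the result on its own.
    strongest = max(signals, key=lambda s: s.score)
    if strongest.score >= override:
        return {"label": "forged", "score": strongest.score,
                "reason": f"override:{strongest.name}"}

    # Rule 2: weighted voting across all signals gives one unified score.
    total_weight = sum(s.weight for s in signals)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    fused = sum(s.score * s.weight for s in signals) / total_weight

    # Rule 3: map the unified score to clean / suspicious / forged.
    if fused >= forged:
        label = "forged"
    elif fused >= suspicious:
        label = "suspicious"
    else:
        label = "clean"
    return {"label": label, "score": fused, "reason": "weighted_vote"}


signals = [
    Signal("classifier", 0.55, 1.0),
    Signal("segmentation", 0.70, 1.5),
    Signal("physical", 0.30, 0.5),
    Signal("ocr", 0.20, 1.0),
    Signal("anomaly", 0.45, 1.0),
]
print(fuse(signals)["label"])  # suspicious (weighted score is about 0.48)
```

Checking the override first means a clear arithmetic failure or a strong tampering mask is not diluted by other signals that see nothing wrong.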

Section 06

Current Limitations and Future Improvement Directions

Limitations:

  • OCR Robustness: Multi-currency support needs improvement;
  • Synthetic Data Bias: The training data uses programmatically generated forgeries, which differ from real-world cases;
  • Rule-Based Fusion: The fusion logic currently relies on hand-written rules and should be replaced with a learned meta-model.

Future Directions: Collect real forged samples for training, improve OCR multi-language and multi-currency support, and explore end-to-end deep learning fusion methods.
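One possible form of a learned meta-model is a logistic regression stacked on top of the per-signal scores. The sketch below is only an illustration of that idea, using plain Python and invented toy data; the post does not say which meta-model the project intends to use, and `train_meta_model` and `predict` are hypothetical names.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_meta_model(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.
    Each row holds the scores of the individual signals for one receipt."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Fused forgery probability for one receipt's signal scores."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Toy data, invented for illustration: [classifier, segmentation, ocr] scores.
rows = [
    [0.2, 0.1, 0.0], [0.4, 0.2, 0.1], [0.3, 0.1, 0.2],  # real receipts
    [0.6, 0.8, 0.7], [0.5, 0.9, 0.6], [0.7, 0.7, 0.9],  # forged receipts
]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_meta_model(rows, labels)
print(predict(weights, bias, [0.4, 0.1, 0.1]) < 0.5)  # True
print(predict(weights, bias, [0.6, 0.9, 0.8]) > 0.5)  # True
```

Compared with fixed rules, the learned weights show how much each signal contributes, but they are only as good as the labeled data, which ties back to the synthetic data bias noted above.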

Section 07

Application Value and Project Summary

Application Value: The project provides a usable detection tool and supports the effectiveness of the "multi-signal fusion" paradigm in document security. It indicates that forgery detection requires a multi-dimensional approach combining spatial localization, semantic understanding, and physical traces.

Summary: The forgery_detection project addresses local tampering detection by combining deep learning with traditional computer vision techniques. Although there is room for improvement, the multi-signal approach offers a useful reference for similar anti-fraud problems.