Multi-Signal AI Receipt Forgery Detection System: An Anti-Fraud Solution Integrating Vision, OCR, and Anomaly Detection

This open-source project builds a multi-signal AI system for detecting tampered and forged receipts. By combining EfficientNet image classification, U-Net pixel-level segmentation, OpenCV-based physical artifact detection, and OCR-based logical verification, the system reaches an AUC of 0.81 and 76% accuracy on the test set, compared with an AUC of 0.67 and 53% accuracy for a single-model baseline.

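Tags: receipt forgery detection, multi-signal fusion, EfficientNet, U-Net, OCR, anomaly detection, computer vision, document forensics, anti-fraud, deep learning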

Published 2026-04-25 19:01 | Recent activity 2026-04-25 19:21 | Estimated read 7 min

Section 01

Introduction

In financial audit and reimbursement workflows, receipt forgery is hard to catch with any single method, especially when the tampering is small and local. The open-source project forgery_detection proposes a multi-signal fusion system that combines EfficientNet classification, U-Net segmentation, OpenCV-based physical artifact detection, OCR logical verification, and anomaly detection. On the test set it reaches an AUC of 0.81 and 76% accuracy, well above the single-model baseline (AUC 0.67, 53% accuracy). The result suggests that fusing complementary signals is a more robust basis for document forensics and anti-fraud work than relying on one model.

Section 02

Problem Background and Dataset Details

The main challenge in receipt forgery is local micro-tampering, such as an altered amount or date, which a single conventional CNN classifier tends to miss. The project is built on the SROIE 2019 dataset and uses 1903 receipts (973 real, 930 forged), each with a pixel-level tampering mask annotation. The classes are nearly balanced (about 1.05:1, real to forged). The data is split into 1426 training, 286 validation, and 191 test samples.
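As a rough illustration, a split with these sizes that preserves the class ratio could be produced as follows. This is only a sketch: the function name `stratified_split`, the seed, and the use of list indices as sample IDs are assumptions, and the source does not describe how the project actually split the data.

```python
import random

def stratified_split(labels, sizes, seed=0):
    """Split sample indices into subsets of the given sizes while keeping
    each subset's class ratio close to the overall ratio."""
    if sum(sizes) != len(labels):
        raise ValueError("sizes must sum to the number of samples")
    rng = random.Random(seed)

    # Group indices by class and shuffle within each class.
    pools = {}
    for idx, label in enumerate(labels):
        pools.setdefault(label, []).append(idx)
    for pool in pools.values():
        rng.shuffle(pool)

    classes = sorted(pools)
    class_share = {c: len(pools[c]) / len(labels) for c in classes}

    splits = []
    for size in sizes[:-1]:
        subset = []
        # Proportional allocation for all classes but the last ...
        for c in classes[:-1]:
            take = round(size * class_share[c])
            subset += [pools[c].pop() for _ in range(take)]
        # ... and the last class fills the subset up to its exact size.
        take_last = size - len(subset)
        subset += [pools[classes[-1]].pop() for _ in range(take_last)]
        splits.append(subset)

    # The final subset receives whatever is left in every pool.
    splits.append([idx for c in classes for idx in pools[c]])
    return splits

# Class counts and split sizes as reported in the text.
labels = ["real"] * 973 + ["forged"] * 930
train, val, test = stratified_split(labels, [1426, 286, 191])
print(len(train), len(val), len(test))  # 1426 286 191
```

Stratifying keeps the real-to-forged ratio of the small test set close to that of the full dataset, so accuracy figures are not skewed by an unlucky draw.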

Section 03

Limitations of Single Models and Value of Multi-Signal Fusion

Experiments show clear limitations for a single model: the EfficientNet-B3 classifier alone reaches an AUC of only 0.67 and 53% accuracy, barely better than chance on a nearly balanced dataset. The multi-signal fusion system improves this to an AUC of 0.81 and 76% accuracy (an AUC gain reported as 13.7 percentage points, and a 23-point gain in accuracy). Key finding: a single global classifier is not sensitive enough to local tampering, and combining multiple signals substantially improves robustness.

Section 04

Detailed Multi-Signal Detection Architecture

The system integrates five complementary signals:

Global Classification: EfficientNet-B3 binary classifier (real vs. forged) with 320×320 input, using test-time augmentation (TTA) at inference and class weighting to handle class imbalance;
Pixel Segmentation: U-Net with an EfficientNet-B3 encoder that outputs pixel-level tampering masks, trained with a loss combining Focal Loss, Dice Loss, and BCE;
Physical Artifact Detection: OpenCV-based error level analysis (ELA, which exposes compression traces), edge detection, illumination consistency analysis, and blob detection;
OCR Logical Verification: Tesseract extracts the text, which is then checked for correct amount arithmetic, field completeness, and format compliance;
Anomaly Detection: Isolation Forest over OCR-derived features, trained only on real receipts so that it can flag forgery types not seen during training.
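To make the OCR logical verification signal more concrete, here is a minimal sketch of an amount consistency check. It assumes the text has already been extracted (for example by Tesseract), that each line item ends with a price, and that the total sits on a line starting with "TOTAL". The regular expressions, the tolerance, and the function name `check_total` are hypothetical and are not taken from the project.

```python
import re
from decimal import Decimal

# Assumed layout: a price with two decimals at the end of the line.
PRICE_AT_END = re.compile(r"(\d+\.\d{2})\s*$")
# Assumed convention: the total line starts with the word TOTAL.
TOTAL_LINE = re.compile(r"^\s*TOTAL\b", re.IGNORECASE)

def check_total(ocr_text, tolerance=Decimal("0.01")):
    """Compare the sum of line-item prices with the stated total."""
    items, total = [], None
    for line in ocr_text.splitlines():
        match = PRICE_AT_END.search(line)
        if not match:
            continue  # no price on this line
        amount = Decimal(match.group(1))
        if TOTAL_LINE.match(line):
            total = amount
        else:
            items.append(amount)

    items_sum = sum(items, Decimal("0"))
    if total is None or not items:
        # Field completeness failed: nothing to compare against.
        status = "incomplete"
    elif abs(items_sum - total) <= tolerance:
        status = "consistent"
    else:
        status = "mismatch"
    return {"status": status, "items_sum": items_sum, "total": total}

genuine = "COFFEE 4.50\nSANDWICH 7.20\nTOTAL 11.70"
tampered = "COFFEE 4.50\nSANDWICH 7.20\nTOTAL 17.70"
print(check_total(genuine)["status"])   # consistent
print(check_total(tampered)["status"])  # mismatch
```

Real receipts also carry taxes, subtotals, discounts, and rounding lines, so a practical rule set would need to handle those before a mismatch could be treated as evidence of tampering.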

Section 05

Decision Fusion Engine and Tech Stack

Decision Fusion Strategy:

Strong-Signal Override: A single high-confidence signal determines the result directly;
Consensus Voting: Weighted voting combines all signals;
Integrated Scoring: The signals are mapped to a unified score, and the system outputs one of three levels (clean/suspicious/forged) along with a confidence value, a heatmap, and the parsed fields.

Technical Implementation: The system uses PyTorch (deep learning), OpenCV (image processing), Tesseract (OCR), Scikit-learn (anomaly detection), and FastAPI (API service). Training was done on Google Colab, and Jupyter Notebooks are provided for both the baseline model and the final multi-signal model.
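The decision fusion strategy described in this section can be sketched as a small rule-based function. The signal names, weights, and thresholds below are illustrative assumptions; the source does not state the values the project uses, and applying the three steps in sequence is one possible reading of the strategy.

```python
def fuse_signals(scores, weights, override=0.95, suspicious=0.4, forged=0.7):
    """Fuse per-signal forgery probabilities (each in [0, 1]) into a verdict.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    # Step 1: strong-signal override. One very confident signal decides alone.
    strongest = max(scores, key=scores.get)
    if scores[strongest] >= override:
        return {"verdict": "forged", "score": scores[strongest],
                "reason": f"override by {strongest}"}

    # Step 2: consensus voting. Weighted average over all signals.
    total_weight = sum(weights[name] for name in scores)
    fused = sum(weights[name] * scores[name] for name in scores) / total_weight

    # Step 3: integrated scoring. Map the unified score to three levels.
    if fused >= forged:
        verdict = "forged"
    elif fused >= suspicious:
        verdict = "suspicious"
    else:
        verdict = "clean"
    return {"verdict": verdict, "score": fused, "reason": "weighted vote"}

# Hypothetical weights for the five signals named in the article.
weights = {"classifier": 1.0, "segmentation": 1.5, "physical": 0.5,
           "ocr": 1.0, "anomaly": 1.0}
scores = {"classifier": 0.55, "segmentation": 0.30, "physical": 0.20,
          "ocr": 0.60, "anomaly": 0.45}
print(fuse_signals(scores, weights)["verdict"])  # suspicious
```

Because the weights and thresholds are set by hand, a function like this is easy to audit but has to be tuned manually. A learned meta-model would instead fit the combination to data.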

Section 06

Current Limitations and Future Improvement Directions

Limitations:

OCR Robustness: Support for multiple currencies needs improvement;
Synthetic Data Bias: The forged samples in the training data are programmatically generated and differ from real-world forgeries;
Rule-Based Fusion: The fusion step is currently hand-written rules and should be replaced with a learned meta-model.

Future Directions: Collect real forged samples for training, extend OCR support to more languages and currencies, and explore end-to-end deep learning fusion methods.

Section 07

Application Value and Project Summary

Application Value: The project provides a usable detection tool and supports the effectiveness of the multi-signal fusion paradigm in document security. It also indicates that forgery detection benefits from a multi-dimensional approach combining spatial localization, semantic understanding, and physical traces.

Summary: The forgery_detection project addresses local tampering detection by combining deep learning with traditional computer vision techniques. There is still room for improvement, as the limitations above show, but the multi-signal approach offers a useful reference for similar anti-fraud problems.
