Zing Forum

DeepTrace: Browser-side AI Cyber Forensics System for Real-time Detection of Phishing Links, Scam Text, and AI-generated Content

A Chrome browser extension built on a multi-model ensemble: XGBoost + LightGBM detect phishing URLs, DeBERTa identifies scam text, RoBERTa distinguishes AI-generated content, and a meta-decision engine combines their outputs into a risk rating with explainable recommendations.

Tags: Cybersecurity, Phishing Detection, AI-generated Content Detection, Browser Extension, Machine Learning, NLP, XGBoost, DeBERTa, RoBERTa, FastAPI
Published 2026-06-13 20:16 | Recent activity 2026-06-13 20:19 | Estimated read 6 min

Section 01

[Introduction] DeepTrace: Browser-side AI Cyber Forensics System for Real-time Detection of Phishing, Scams, and AI-generated Content

This article introduces DeepTrace, a Chrome browser extension developed and maintained by ZeehaanShah. It is an AI cyber forensics system built on a multi-model ensemble. Its core functions are real-time detection of phishing URLs, identification of scam text, and detection of AI-generated content; a meta-decision engine then combines these signals into a risk rating with explainable recommendations. The system runs locally, which keeps responses fast and user data private. It was released in 2025 and is still maintained, and its source code is available on GitHub: https://github.com/ZeehaanShah/DeepTrace-Cyber-Forensics

Section 02

Background: The Linguistic Turn of Cyber Threats and the Birth of DeepTrace

By 2025, cyber attacks had shifted toward "linguistic attacks": messages and links crafted to exploit human psychology, for example by misleading a user's judgment, which traditional security software struggles to handle. DeepTrace addresses this by embedding forensic capabilities into the browser, so protection applies in real time at the first point where users encounter a threat. Unlike cloud-based services, it runs locally, which helps with both response speed and user privacy.

Section 03

System Architecture: Three Detection Modules + Meta-Decision Engine

DeepTrace adopts a modular design:

  1. Phishing URL Detection: an XGBoost + LightGBM ensemble with 50/50 weighting over 52 extracted features (length, character statistics, structural markers, etc.). Reported performance: accuracy 94.81%, F1 = 0.9539, AUC-ROC = 0.9885.
  2. Scam Text Detection: a fine-tuned DeBERTa-v3-xsmall model combined with 9 rule detectors and 25+ risk keywords. Reported performance: accuracy 98%, F1 = 0.9801.
  3. AI-generated Content Detection: a pre-trained RoBERTa-based model (Hello-SimpleAI/chatgpt-detector-roberta) trained on the HC3 dataset. Reported accuracy: about 97%.
  4. Meta-Decision Engine: a logistic regression model that fuses the results of the three modules and outputs a Normal / Phishing / AI-generated rating. Reported cross-validation accuracy: 99.19% ± 0.15%.
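
To make the URL module more concrete, here is a minimal sketch of two ideas from the list above: computing a few lexical URL features and averaging two model probabilities with 50/50 weights. It is not taken from the DeepTrace repository. The function names, the five features shown (the project uses 52, which the article does not enumerate), and the placeholder probabilities are all assumptions for illustration.

```python
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Compute a handful of lexical features (hypothetical subset of the 52)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),                           # length
        "digit_count": sum(ch.isdigit() for ch in url),   # character statistics
        "hyphen_count": url.count("-"),                   # character statistics
        "dot_count_in_host": host.count("."),             # structural marker
        "uses_https": int(parsed.scheme == "https"),      # structural marker
    }


def ensemble_probability(p_xgb: float, p_lgbm: float, w_xgb: float = 0.5) -> float:
    """Weighted average of two phishing probabilities; w_xgb=0.5 is a 50/50 blend."""
    if not 0.0 <= w_xgb <= 1.0:
        raise ValueError("w_xgb must be between 0 and 1")
    return w_xgb * p_xgb + (1.0 - w_xgb) * p_lgbm


features = url_features("http://paypa1-secure-login.xyz/verify")

# Placeholder probabilities standing in for real XGBoost / LightGBM outputs.
score = ensemble_probability(0.9, 0.7)
```

In a real pipeline the two probabilities would come from the trained models' predictions on the full feature vector; the averaging step itself stays this simple.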

Section 04

Technical Implementation: FastAPI Backend and Chrome Extension Frontend

DeepTrace's tech stack:

  • Backend: Python 3.11 with the FastAPI framework, exposing RESTful APIs, with SlowAPI rate limiting (30 requests per minute) and Docker containerized deployment;
  • Frontend: a Chrome extension (Manifest V3) consisting of a Service Worker (background processing), a sidebar UI (result rendering), and content scripts (text selection buttons);
  • Model Service: the backend caches the models of the three modules and exposes a unified analysis endpoint at /api/v1/analyze.
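
As a rough illustration of how a client might talk to the unified endpoint, the sketch below builds (but does not send) a POST request to /api/v1/analyze using only the Python standard library. The endpoint path comes from the article; the base URL http://localhost:8000, the JSON field names `type` and `content`, and the helper name `build_analyze_request` are assumptions, since the article does not document the request schema.

```python
import json
import urllib.request

API_PATH = "/api/v1/analyze"  # endpoint path as given in the article


def build_analyze_request(base_url: str, content: str, kind: str) -> urllib.request.Request:
    """Build a JSON POST request for the analysis endpoint.

    The payload shape is hypothetical; check the project's API docs
    for the real field names.
    """
    payload = {"type": kind, "content": content}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + API_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local backend address; adjust to wherever the container listens.
request = build_analyze_request("http://localhost:8000", "https://www.google.com", "url")

# To actually send it (requires a running backend):
# with urllib.request.urlopen(request, timeout=10) as response:
#     result = json.load(response)
```

Given the 30 requests per minute limit, a client that sends bursts should be prepared for the backend to reject excess calls and retry after a pause.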

Section 05

Test Cases and Performance Verification

The project provides typical test cases:

| Input | Expected Result | Detection Module |
| --- | --- | --- |
| http://paypa1-secure-login.xyz/verify | 🚨 Phishing Link | URL |
| https://www.google.com | ✅ Normal | URL |
| "Dear customer, your account is suspended. Verify your OTP immediately." | 🚨 Scam Text | Text |
| "Hi John, meeting at 3pm tomorrow. Bring the Q3 slides." | ✅ Normal | Text |
| AI-generated paragraph | ⚠️ AI-generated | AI Detection |

Performance of each module: URL detection accuracy 94.81%, text detection 98%, AI-generated content detection 97% (HC3 benchmark), meta-engine cross-validation accuracy 99.19%.
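
The test cases above could be turned into a small regression check. The sketch below is one way to do that, assuming a hypothetical `analyze` callable that maps an input string to a label; the label strings ("phishing", "normal", "scam") are made up for illustration, and the AI-generated case is left out because the article does not give its text.

```python
from typing import Callable, List, Tuple

# (input, expected label) pairs taken from the test cases; labels are hypothetical names.
CASES: List[Tuple[str, str]] = [
    ("http://paypa1-secure-login.xyz/verify", "phishing"),
    ("https://www.google.com", "normal"),
    ("Dear customer, your account is suspended. Verify your OTP immediately.", "scam"),
    ("Hi John, meeting at 3pm tomorrow. Bring the Q3 slides.", "normal"),
]


def run_cases(analyze: Callable[[str], str]) -> List[Tuple[str, str, str]]:
    """Return (input, expected, actual) for every case the analyzer gets wrong."""
    failures = []
    for text, expected in CASES:
        actual = analyze(text)
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


def always_normal(_: str) -> str:
    """Stub analyzer used only to demonstrate the harness."""
    return "normal"


failures = run_cases(always_normal)
```

With the stub, the two malicious inputs show up as failures; swapping in a function that calls the real backend would exercise the actual models.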

Section 06

Project Significance and Insights

DeepTrace illustrates where cybersecurity tools are heading: from passive defense to active identification, from cloud-based to local real-time analysis, and from black-box verdicts to explainable output. For users, it offers professional security analysis inside the browser without enterprise-level software. For developers, it demonstrates one way to integrate multiple AI models with a browser extension. More importantly, it reflects how security threats are evolving in the AI era: the tool that detects AI-generated content is itself a product of AI technology, which sets up a technological arms race over verifying authenticity.