PHISHIELD: A Three-Tier AI-Driven Phishing Detection System

A comprehensive phishing attack detection solution combining rule engines, AI models, and browser extensions, achieving high-accuracy risk identification through a multi-layer scoring mechanism.

Phishing detection | Cybersecurity | AI security | Browser extension | FastAPI | Machine learning | Threat intelligence

Published 2026-06-02 17:13 | Recent activity 2026-06-02 17:18 | Estimated read 6 min

Section 01

PHISHIELD: Introduction to the Three-Tier AI-Driven Phishing Detection System

PHISHIELD is a phishing detection system that combines a rule engine, AI models, and a browser extension, and aims for high-accuracy risk identification through a multi-layer scoring mechanism. It targets the cases where traditional protection methods fail, and provides end-to-end detection across a web dashboard, a browser extension, and a unified backend API.

Section 02

Background and Problems

Phishing is one of the most common and damaging threats in cybersecurity: over 90% of data breaches reportedly start with a phishing email or a malicious link. Traditional protection relies on blacklists and rule matching, which attackers increasingly evade with techniques such as URL obfuscation and domain spoofing. PHISHIELD, built as a four-week master's graduation project, aims to provide an end-to-end AI-driven phishing detection system.

Section 03

System Architecture Overview

PHISHIELD adopts a three-tier detection architecture:

Rule Engine Layer: Includes Google Safe Browsing API (40% weight), URLhaus blacklist (30%), WHOIS domain age analysis (15%), and heuristic rules (15%) for quick initial screening.
NLP/AI Layer: Uses the GPT-4o-mini model to analyze the semantics of email subjects/body or URLs, evaluating dimensions such as urgency, grammar quality, and link credibility, and returns structured scores.
HTTP Header Analysis Layer: Detects security configuration flaws of target websites (e.g., missing CSP/HSTS headers), abnormal server fingerprints, redirect chains, etc.
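
The rule engine weights listed above can be illustrated with a small weighted sum. This is a minimal sketch, not PHISHIELD's actual code: the weights come from the list, while the function name, the signal keys, the [0, 1] signal range, and the handling of missing checks are all assumptions made for illustration.

```python
# Weights as stated for the Rule Engine Layer; all names are hypothetical.
RULE_WEIGHTS = {
    "google_safe_browsing": 0.40,
    "urlhaus": 0.30,
    "whois_domain_age": 0.15,
    "heuristics": 0.15,
}


def rule_engine_score(signals: dict[str, float]) -> float:
    """Combine per-check risk signals (each in [0, 1]) into a 0-100 score.

    A missing check counts as 0.0 (no evidence of risk). That choice is an
    assumption of this sketch, not documented PHISHIELD behavior.
    """
    total = 0.0
    for name, weight in RULE_WEIGHTS.items():
        value = signals.get(name, 0.0)
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"signal {name!r} must be in [0, 1], got {value}")
        total += weight * value
    # Round to hide floating-point noise from the weighted sum.
    return round(100.0 * total, 2)
```

Under these assumptions, a URL that is listed on URLhaus and has a very young domain, but is not flagged by the other two checks, would receive 30 + 15 = 45 points from this layer.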

Section 04

Integrated Scoring Mechanism

The results from the three tiers are weighted and combined via an integrated scorer to generate a final risk score from 0 to 100:

0-30: Safe (clean)
31-70: Suspicious, recommend caution
71-100: Phishing, avoid access

The system extracts the top 3 judgment reasons and persists scan records to a PostgreSQL database to support historical queries and feedback collection.
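
A rough sketch of how such an integrated scorer might look is shown below. The three score bands follow the ranges above. The tier weights, the `Finding` type, and the way reasons are ranked are hypothetical, since the article does not state how the tiers are weighted against each other or how the top reasons are chosen.

```python
from dataclasses import dataclass

# Hypothetical tier weights: the article does not give the real values.
TIER_WEIGHTS = {"rules": 0.5, "ai": 0.3, "headers": 0.2}


@dataclass(frozen=True)
class Finding:
    reason: str      # human-readable explanation shown to the user
    tier: str        # tier that produced it: "rules", "ai" or "headers"
    severity: float  # illustrative 0-100 severity, used only for ranking


def verdict_for(score: int) -> str:
    """Map a 0-100 risk score onto the three bands from the article."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 30:
        return "safe"
    if score <= 70:
        return "suspicious"
    return "phishing"


def combine(tier_scores: dict[str, float], findings: list[Finding]) -> dict:
    """Weight the per-tier scores (each 0-100) and keep the top 3 reasons."""
    weighted = sum(TIER_WEIGHTS[t] * tier_scores.get(t, 0.0) for t in TIER_WEIGHTS)
    score = max(0, min(100, round(weighted)))
    # Rank reasons by severity scaled with the weight of their tier.
    ranked = sorted(
        findings,
        key=lambda f: TIER_WEIGHTS.get(f.tier, 0.0) * f.severity,
        reverse=True,
    )
    return {
        "score": score,
        "verdict": verdict_for(score),
        "top_reasons": [f.reason for f in ranked[:3]],
    }
```

The returned dictionary holds the score, the verdict, and the top reasons, which is roughly the shape of record one would persist for history queries and feedback. The actual database schema is not described in the article.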

Section 05

Technology Stack and Core APIs

Technology Stack:

Backend: FastAPI, PostgreSQL (Neon), SQLAlchemy, Pydantic
Frontend: React 18, TypeScript, Tailwind CSS
Browser Extension: Chrome Manifest V3, Service Worker, Content Script

Core API Endpoints:

POST /analyze/url: URL analysis
POST /analyze/email: email analysis
POST /analyze/email/upload: EML upload
GET /history: historical records
GET /health: health check
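
As an illustration of calling the POST /analyze/url endpoint from a third-party script, the sketch below uses only the Python standard library. The base URL and the `{"url": ...}` request body are assumptions, because the article lists the endpoint paths but not their schemas. The response is simply decoded as JSON without assuming its fields.

```python
import json
import urllib.request

# Hypothetical base URL for a locally running backend.
BASE_URL = "http://localhost:8000"


def build_url_scan_request(
    target_url: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Prepare (but do not send) a POST /analyze/url request.

    The {"url": ...} body shape is an assumption; consult the service's own
    API documentation for the real schema.
    """
    payload = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/analyze/url",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def scan_url(target_url: str, base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_url_scan_request(target_url, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending keeps the assumed schema in one place and lets the request be inspected or tested without a running server.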

Section 06

Deployment and Usage Scenarios

PHISHIELD supports multiple scenarios:

Individual Users: Chrome extension for real-time risk prompts, active URL scanning/email upload.
Enterprise Security: Web dashboard for centralized analysis, batch detection, historical auditing, and feedback optimization.
Developer Integration: Open REST API with JSON responses, easy to integrate into third-party applications.

Section 07

Project Significance and Insights

PHISHIELD demonstrates the modern cybersecurity protection paradigm:

Multi-layer defense: Combining rules + AI + protocol analysis to improve accuracy;
Human-machine collaboration: AI provides judgments, users make decisions, and feedback loops optimize the system;
Full-stack coverage: From backend API to frontend interface to extension, a complete experience;
Scalable architecture: Modular design supports adding new detection capabilities in the future.

It provides a reference for security developers/researchers, showing the practice of combining traditional rules with modern AI.