FusionPhishGuard: Attention-Enhanced Multi-Branch Phishing Detection Framework for Mobile and Web Platforms

This post introduces how the FusionPhishGuard framework detects cross-platform phishing attacks through multi-granularity tokenization, hybrid Transformer and LLM embeddings, and an attention fusion mechanism.

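Tags: phishing detection, cybersecurity, deep learning, attention mechanism, multi-branch fusion, Transformer, large language model, BiLSTM, mobile security
Published 2026-05-25 12:43 | Recent activity 2026-05-25 12:51 | Estimated read 6 min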

Section 01

FusionPhishGuard Framework Overview: Multi-Branch Attention-Enhanced Solution for Cross-Platform Phishing Detection

FusionPhishGuard is an attention-enhanced multi-branch deep learning framework proposed by Yashwanth Yallavula et al. for detecting phishing attacks across mobile and Web platforms. It has been accepted at IEEE COMSNETS 2026 (SysAI Track). The framework addresses the complexity of modern phishing attacks through multi-granularity tokenization, hybrid Transformer and LLM embeddings, and an attention fusion mechanism. The project was open-sourced on GitHub in May 2026 at https://github.com/bytemonkk/FusionPhishGuard.

Section 02

Evolution Trends and Detection Challenges of Modern Phishing Attacks

Modern phishing attacks increasingly rely on obfuscation (URL encoding, internationalized domain names, and similar tricks), precise brand impersonation, diversified attack channels (such as mobile redirection), and LLM-generated personalized content. Traditional detection based on blacklists or simple feature matching can no longer keep up, so detection needs to understand URL semantics, capture hidden patterns, and work across platforms.
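To make the obfuscation trends above concrete, here is a minimal sketch of how a few of these signals could be surfaced from a raw URL with the Python standard library. The function name `obfuscation_signals` and the chosen indicators are illustrative assumptions; they are not taken from the FusionPhishGuard code and are far simpler than a learned detector.

```python
from urllib.parse import urlsplit, unquote


def obfuscation_signals(url: str) -> dict:
    """Return a few simple, illustrative obfuscation indicators for a URL.

    Hypothetical helper for illustration only; not part of FusionPhishGuard.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".") if host else []
    return {
        # Percent-encoding is present if decoding changes the string.
        "percent_encoded": unquote(url) != url,
        # IDN hosts are transmitted as punycode labels starting with "xn--".
        "punycode_host": any(label.startswith("xn--") for label in labels),
        # Raw non-ASCII characters in the host (IDN before encoding).
        "non_ascii_host": not host.isascii(),
        # Labels beyond "name.tld"; deep subdomains are a common disguise.
        "subdomain_depth": max(len(labels) - 2, 0),
    }
```

For example, `obfuscation_signals("http://xn--pypal-4ve.example.com/login%2Easpx")` flags both percent-encoding and a punycode host. Hand-written signals like these are exactly what blacklist-style systems rely on, which is why they miss patterns that embedding-based models can learn.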

Section 03

FusionPhishGuard Framework Design: Core Concept of Multi-Perspective Fusion

The core of the framework is "multi-perspective fusion": seven parallel embedding branches, each capturing a different aspect of the URL. The branches are Word2Vec (word-level pattern recognition), FastText (character-level subword handling of out-of-vocabulary tokens), BERT (deep contextual understanding), RoBERTa (an optimized Transformer), MiniLM (lightweight contextual modeling), Qwen (LLM-based semantic reasoning), and Falcon (large-scale language understanding).
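The difference between the word-level and character-level views can be illustrated with a small standard-library sketch. The helpers `word_tokens` and `char_ngrams` are hypothetical names introduced here, and the splitting rules are assumptions; the actual tokenizer in the repository may differ.

```python
import re
from typing import List
from urllib.parse import urlsplit


def word_tokens(url: str) -> List[str]:
    """Split each URL component on delimiters and tag tokens with their component.

    Tagging keeps the hierarchical position (scheme, host, path, query) visible.
    Hypothetical helper; the splitting rules are an assumption.
    """
    parts = urlsplit(url)
    components = (
        ("scheme", parts.scheme),
        ("host", parts.hostname or ""),
        ("path", parts.path),
        ("query", parts.query),
    )
    tokens = []
    for name, value in components:
        for piece in re.split(r"[^A-Za-z0-9]+", value):
            if piece:
                tokens.append(f"{name}:{piece.lower()}")
    return tokens


def char_ngrams(token: str, n: int = 3) -> List[str]:
    """FastText-style character n-grams with '<' and '>' boundary markers."""
    padded = f"<{token}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]
```

With these definitions, `word_tokens("https://secure-login.paypa1.com/account/verify?id=42")` yields tokens such as `host:paypa1` and `path:verify`, while `char_ngrams("paypa1")` yields `['<pa', 'pay', 'ayp', 'ypa', 'pa1', 'a1>']`. A word-level model sees `paypa1` as an unknown token, whereas the n-gram view still shares most of its pieces with `paypal`, which is the motivation for pairing the two kinds of branch.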

Section 04

FusionPhishGuard Architecture Details: Complete Process from Tokenization to Classification

The architecture is divided into four stages:

1. Multi-granularity tokenization: preserves the hierarchical structure of the URL while splitting it into finer-grained lexical units.
2. Embedding extraction: each of the seven branches produces its own embedding representation.
3. Attention fusion: an adaptive gating mechanism dynamically weights the branches, and a Squeeze-and-Excitation module then refines the fused features.
4. Sequence modeling and classification: a BiLSTM captures long-range dependencies, and a final binary classifier decides whether the URL is a phishing link.
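The attention fusion stage can be sketched in plain Python to show the arithmetic involved. This is a simplified illustration under stated assumptions, not the repository's PyTorch implementation: the gate scores and the weight matrices `w1` and `w2` stand in for learned parameters, all branch embeddings are assumed to share one dimension, and the function names are hypothetical.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(branches: List[List[float]], gate_scores: List[float]) -> List[float]:
    """Weighted sum of equal-length branch embeddings.

    gate_scores is an assumed input standing in for the output of a learned gate.
    """
    weights = softmax(gate_scores)
    dim = len(branches[0])
    return [sum(w * b[i] for w, b in zip(weights, branches)) for i in range(dim)]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def squeeze_excite(seq: List[List[float]], w1: List[List[float]],
                   w2: List[List[float]]) -> List[List[float]]:
    """Squeeze-and-Excitation over a T x C feature sequence.

    w1 (R x C) and w2 (C x R) are hypothetical learned weights.
    """
    channels = len(seq[0])
    # Squeeze: average each channel over the sequence dimension.
    squeezed = [sum(step[c] for step in seq) / len(seq) for c in range(channels)]
    # Excitation: bottleneck with ReLU, then per-channel sigmoid scales.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    scale = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Rescale every time step channel by channel.
    return [[x * s for x, s in zip(step, scale)] for step in seq]
```

With equal gate scores the fusion reduces to a plain average of the branches; a trained gate would instead raise the weight of whichever branches are most informative for a given URL. Those per-branch weights are also what an analyst can inspect when interpreting a prediction.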

Section 05

Experimental Evaluation: Performance and Interpretability of FusionPhishGuard

The framework was evaluated on the CatchPhish D2 (obfuscated URL samples) and PhishDump (large-scale cross-platform samples) datasets. On CatchPhish it achieved 95.16% accuracy, a 95.19% F1 score, and an MCC of 0.9044. The multi-branch fusion strategy proved effective; for example, the MiniLM+Gemma combination reached 94.88% accuracy. The attention mechanism also provides interpretability, helping analysts understand the model's decision logic.
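For readers less familiar with the reported metrics, the sketch below computes accuracy, F1, and the Matthews correlation coefficient (MCC) from a binary confusion matrix using the standard definitions. The function name `binary_metrics` is hypothetical, and the counts in the usage note are made-up toy values, not results from the paper.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, F1 and MCC from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}
```

As a toy example, `binary_metrics(tp=90, fp=10, fn=10, tn=90)` gives an accuracy and F1 of 0.9 and an MCC of 0.8. MCC ranges from -1 to 1 and is typically lower than accuracy for the same classifier, which is consistent with the gap between the reported 95.16% accuracy and 0.9044 MCC.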

Section 06

Technical Insights and Application Value: Significance of Hybrid Architecture and Cross-Platform Detection

Key insights include the advantages of a hybrid architecture (traditional embeddings + Transformer + LLM), the interpretability value of the attention mechanism, and the necessity of cross-platform detection. In terms of application, the framework points enterprise security teams toward a next-generation approach to threat detection, its open-source implementation (Python 3.10 + PyTorch 2.0) supports community deployment, and it shows researchers how NLP techniques can be transferred to the security domain.

Section 07

Limitations and Future Directions: Resource Optimization and Multimodal Integration

Limitations include high computational resource requirements, a focus on URL-level detection only, and vulnerability to adversarial samples. Future directions include exploring model compression techniques (knowledge distillation, quantization), integrating multimodal features (screenshots, WHOIS data), and developing online learning mechanisms that adapt as attacks evolve.
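As a rough illustration of what quantization means in this context, the sketch below applies symmetric 8-bit quantization to a list of weights in plain Python. It is a generic textbook scheme with hypothetical function names, not something the authors describe, and a real deployment would rely on the quantization tooling of the deep learning framework in use.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # A single scale maps the largest magnitude onto 127.
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]
```

Each weight is then stored in one byte instead of four, and the reconstruction error per weight is at most about half the scale. Whether the detection accuracy survives such compression would have to be measured empirically.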