
Can Large Language Models Explain AI? Research on Explainable AI in Cybersecurity Scenarios

An empirical study of whether large language models (LLMs) can support explainable AI (XAI) in the cybersecurity domain, comparing LLM-generated explanations with the traditional SHAP and LIME methods.

Explainable AI, Large Language Models, Cybersecurity, SHAP, LIME, Machine Learning, Hallucination, Intrusion Detection

Published 2026-05-18 19:15 | Recent activity 2026-05-18 19:18 | Estimated read 8 min


Section 01

[Introduction] Exploring the Role of Large Language Models in Cybersecurity XAI

The study asks whether LLMs can reliably replace or enhance traditional XAI methods such as SHAP and LIME in cybersecurity settings. Through a comparative experimental design and a human evaluation, it shows that LLM explanations generated without attribution data are prone to hallucination, that data from traditional XAI methods plays a key role in grounding them, and it derives best practices for using LLMs for XAI.

Section 02

Research Background: The Interpretability Dilemma of Black-Box Models

Machine learning models are widely used in cybersecurity (e.g., intrusion detection, malware analysis), but most operate as "black boxes" whose decision-making is difficult to understand. In high-risk security scenarios, interpretability is a core requirement for both trusting a model and acting on its output: security analysts need to understand why a model reached a decision, and a prediction without an explanation gives little guidance for a practical response.

Section 03

Experimental Design: Multi-Dimensional Comparative Evaluation

Datasets and Scenarios

The experiment uses three cybersecurity datasets: Network_logs.csv (network traffic features and anomaly labels), cybersecurity_intrusion_data.csv (intrusion detection features and labels), and the KDD Cup dataset (a classic intrusion detection benchmark).

Comparative Methods

Pure LLM Explanation: The LLM generates explanations directly from the model inputs and prediction results, with no feature importance information;
LLM-Enhanced Explanation: The LLM generates explanations from the same inputs combined with SHAP/LIME feature importance;
Traditional XAI Output: The raw SHAP/LIME feature importance charts and numerical values are used as-is.
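The difference between the first two methods can be sketched as a difference in what the LLM is given. The snippet below is a minimal illustration, not the study's actual prompts: the function name `build_prompt`, the prompt wording, and the `top_k` parameter are all assumptions, and the attribution scores are expected to come from SHAP or LIME computed elsewhere.

```python
def build_prompt(sample, prediction, attributions=None, top_k=3):
    """Build an explanation prompt (hypothetical wording).

    attributions=None gives the "pure LLM" variant; passing a
    feature -> signed score mapping gives the "LLM-enhanced" variant.
    """
    lines = [
        "You are assisting a security analyst.",
        f"The model predicted: {prediction}",
        "Input features:",
    ]
    lines += [f"- {name} = {value}" for name, value in sample.items()]

    if attributions is not None:
        # Keep only the top_k features by absolute attribution.
        ranked = sorted(attributions.items(),
                        key=lambda kv: abs(kv[1]), reverse=True)[:top_k]
        lines.append("Feature attributions (signed, from SHAP or LIME):")
        lines += [f"- {name}: {score:+.3f}" for name, score in ranked]
        lines.append("Base the explanation only on the attributions listed above.")

    lines.append("Explain the prediction in plain language.")
    return "\n".join(lines)
```

Calling `build_prompt(sample, prediction)` yields the pure variant, while passing an attribution mapping adds the ranked scores the LLM is asked to stay within. The third method needs no prompt at all, since the SHAP/LIME output is shown to the reader directly.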

Evaluation Models

Two representative LLMs are selected: GPT-5 (closed-source) and GPT-OSS-20B (open-source).

Section 04

Key Findings and Human Evaluation Evidence

Key Finding: LLM Hallucination Issues

Pure LLM explanations show serious hallucinations of three kinds:

Feature Importance Hallucination: The LLM fabricates feature importance that is inconsistent with the model's actual logic;
Semantic Bias: The LLM infers from the semantics of feature names (e.g., assuming from the name "packet_size" that larger packets are suspicious) rather than from the patterns the model actually learned;
Lack of Consistency: The explanation logic contradicts itself across different samples.

Once SHAP/LIME data is supplied to the LLM, explanations become more coherent and better aligned with the model, and hallucinations are significantly reduced.
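One rough way to screen for feature importance hallucination is to compare the features an explanation names against the features the attribution method actually ranks highest. The sketch below is an assumption-laden illustration rather than the study's evaluation procedure: `check_explanation` is a hypothetical helper, it detects only literal mentions of feature names, and it does not verify the direction or magnitude the text claims.

```python
import re

def check_explanation(explanation, attributions, top_k=3):
    """Compare features named in the text with the top_k attributed features.

    attributions maps feature name -> signed score (e.g. from SHAP or LIME).
    """
    ranked = sorted(attributions, key=lambda n: abs(attributions[n]), reverse=True)
    supported = set(ranked[:top_k])

    # Whole-word match, so "size" does not match inside "packet_size".
    mentioned = {
        name for name in attributions
        if re.search(rf"\b{re.escape(name)}\b", explanation)
    }
    return {
        "mentioned": sorted(mentioned),
        "unsupported": sorted(mentioned - supported),  # cited but not top-ranked
        "missed": sorted(supported - mentioned),       # top-ranked but not cited
    }
```

A non-empty `unsupported` list is a signal to review the explanation by hand, not proof of hallucination, because an explanation may legitimately mention a low-ranked feature for context.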

Human Evaluation Results

Preferences among the 38 participants:

SHAP/LIME-enhanced GPT-5 explanations were the most preferred, for being accurate and easy to understand;
GPT-OSS-20B performed competitively;
Raw SHAP/LIME outputs were generally considered obscure and difficult to understand.

What this implies about user needs: accuracy and understandability must be balanced, and the LLM-enhanced approach is a feasible way to do so.

Section 05

Methodological Insights: Principles for Correct LLM Usage

Based on these findings, the study proposes best practices for using LLMs for XAI:

Never use LLMs alone: Without feature importance data from traditional XAI methods, LLM explanations are prone to hallucination;
LLMs are enhancers, not replacements: Use them to translate the outputs of traditional XAI techniques into natural language and to add context;
Maintain verifiability: Every explanation should be traceable to specific feature importance data;
Consider the audience's background: Support multiple levels of explanation granularity (technical experts prefer raw outputs, decision-makers prefer natural language).
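The last two principles can be combined in one small rendering step: keep the attribution scores alongside the LLM-written narrative, and choose how much of each to show per audience. This is a minimal sketch under assumed names (`render_explanation`, the `"expert"` audience label, and the output layout are all hypothetical), with the narrative text produced elsewhere.

```python
def render_explanation(prediction, attributions, narrative, audience):
    """Render one explanation at a granularity suited to the audience.

    attributions: feature -> signed score from SHAP or LIME.
    narrative: natural-language text written by the LLM.
    """
    ranked = sorted(attributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    table = "\n".join(f"{name}\t{score:+.3f}" for name, score in ranked)

    if audience == "expert":
        # Technical readers get the raw ranked scores.
        return f"Prediction: {prediction}\n{table}"

    # Other readers get the narrative, with the source scores attached
    # so every claim can be traced back to the attribution data.
    return f"{narrative}\n\nSource attributions:\n{table}"
```

Attaching the scores to the narrative view, rather than hiding them, keeps the natural-language explanation verifiable without forcing non-technical readers to start from the raw output.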

Section 06

Practical Implications and Research Limitations

Implications for Cybersecurity Practice

Intrusion Detection Systems: LLM-enhanced XAI accelerates decision-making for security analysts;
Compliance Audits: It helps meet requirements for decision interpretability and auditability;
Human-AI Collaboration: It makes the interface between AI systems and human analysts more efficient.

Limitations and Future Directions

Limited Model Scope: Only GPT-5 and GPT-OSS-20B were tested; other LLMs need verification;
Domain Specificity: The results are based on cybersecurity datasets and may differ in other domains;
Long-Term Stability: LLM behavior changes across versions and needs continuous monitoring.

Future directions: multi-modal explanations, using XAI to improve LLMs, and standardized evaluation benchmarks.

Section 07

Conclusion: A Rational View of LLM's Role in XAI

LLMs are powerful but not omnipotent. In XAI scenarios, trusting LLM explanations blindly exposes users to hallucinated reasoning; the real value of LLMs is as an "explanation layer" on top of traditional XAI methods, converting technical outputs into understandable knowledge. Organizations that use LLMs for explanation are advised to combine them with traditional methods such as SHAP and LIME, to balance accuracy and understandability.
