
AI-Based-Honeypot-Attack-Detection-System: A Network Attack Detection System Based on Honeypots and Machine Learning

A network attack detection system that combines Cowrie and Dionaea honeypot technologies with the Random Forest machine learning algorithm. It can extract features from honeypot logs and automatically classify brute-force attacks and interactive attacks.

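Tags: Honeypot, Network Security, Machine Learning, Random Forest, Cowrie, Dionaea, Attack Detection, Threat Intelligence, scikit-learn, Feature Engineering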

Published 2026-06-17 00:45 | Recent activity 2026-06-17 00:53 | Estimated read 7 min


Section 01

[Introduction] Core Overview of AI-Based-Honeypot-Attack-Detection-System

Project Name: AI-Based-Honeypot-Attack-Detection-System
Original Author: Sidd1007
Source: GitHub (https://github.com/Sidd1007/AI-Based-Honeypot-Attack-Detection-System)
Core Functions: Combines the Cowrie and Dionaea honeypots with the Random Forest machine learning algorithm. It extracts features from honeypot logs and automatically classifies activity as brute-force or interactive attacks, adding automated analysis to network security defense.

Section 02

[Background] Basics of Honeypot Technology and Types Used in the Project

Definition of Honeypot

A honeypot is a security mechanism that lures attackers with seemingly valuable targets, then records and analyzes their behavior. Because a honeypot serves no legitimate users, any access to it can be treated as suspicious activity.

Honeypots Used in the Project

Cowrie: A medium-interaction SSH/Telnet honeypot that simulates a Unix environment and records brute-force attempts, shell interactions, command execution, file downloads, etc.
Dionaea: A low-interaction honeypot that simulates vulnerable services such as SMB and MSSQL and captures malware payloads.

Section 03

[Methodology] System Workflow

The system operates in six phases:

Honeypot Deployment: Deploy Cowrie and Dionaea in an isolated environment, configure logs and monitoring.
Attack Capture: Record incoming connections, login attempts, command execution, file downloads, etc.
Feature Extraction: Extract features from logs (total number of attempts, number of unique usernames, number of failed/successful attempts, average time interval, number of command executions, session duration).
Dataset Generation: Organize features into CSV format.
Model Training: Use scikit-learn to train a Random Forest classifier (ensemble learning to improve accuracy and robustness).
Attack Prediction: The model classifies new activities as brute-force attacks or interactive attacks.
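
The feature extraction and dataset generation phases can be sketched as follows. This is a minimal illustration, not the project's own script: it assumes Cowrie's JSON log output (one event per line, with `eventid`, `src_ip`, `username`, `timestamp`, and `duration` fields), groups events by source IP, and uses hypothetical column names for the features listed above.

```python
import csv
import io
import json
from collections import defaultdict
from datetime import datetime

# Hypothetical column names for the features described in the article.
FEATURES = [
    "total_attempts", "unique_usernames", "failed_attempts",
    "successful_attempts", "avg_interval", "command_count", "session_duration",
]

def parse_ts(value):
    # Cowrie writes ISO 8601 timestamps ending in "Z"; normalise for fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def extract_features(lines):
    """Aggregate JSON log lines into one feature row per source IP (assumed grouping)."""
    groups = defaultdict(lambda: {"logins": [], "users": set(), "failed": 0,
                                  "success": 0, "commands": 0, "duration": 0.0})
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        g = groups[event.get("src_ip", "unknown")]
        eventid = event.get("eventid", "")
        if eventid in ("cowrie.login.failed", "cowrie.login.success"):
            g["logins"].append(parse_ts(event["timestamp"]))
            g["users"].add(event.get("username"))
            g["failed" if eventid.endswith("failed") else "success"] += 1
        elif eventid == "cowrie.command.input":
            g["commands"] += 1
        elif eventid == "cowrie.session.closed":
            g["duration"] += float(event.get("duration", 0))
    rows = []
    for src_ip, g in sorted(groups.items()):
        times = sorted(g["logins"])
        # Seconds between consecutive login attempts from the same source.
        gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
        rows.append({
            "src_ip": src_ip,
            "total_attempts": len(times),
            "unique_usernames": len(g["users"]),
            "failed_attempts": g["failed"],
            "successful_attempts": g["success"],
            "avg_interval": sum(gaps) / len(gaps) if gaps else 0.0,
            "command_count": g["commands"],
            "session_duration": g["duration"],
        })
    return rows

# Invented sample events for illustration only (203.0.113.5 is a documentation address).
ip = "203.0.113.5"
sample_log = [json.dumps(e) for e in [
    {"eventid": "cowrie.login.failed", "src_ip": ip, "username": "root",
     "timestamp": "2026-01-01T00:00:00.000000Z"},
    {"eventid": "cowrie.login.failed", "src_ip": ip, "username": "admin",
     "timestamp": "2026-01-01T00:00:02.000000Z"},
    {"eventid": "cowrie.login.success", "src_ip": ip, "username": "root",
     "timestamp": "2026-01-01T00:00:04.000000Z"},
    {"eventid": "cowrie.command.input", "src_ip": ip, "input": "uname -a",
     "timestamp": "2026-01-01T00:00:06.000000Z"},
    {"eventid": "cowrie.session.closed", "src_ip": ip, "duration": 12.5,
     "timestamp": "2026-01-01T00:00:16.500000Z"},
]]

rows = extract_features(sample_log)

# Dataset generation: write the rows as CSV (to memory here; a file works the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["src_ip"] + FEATURES)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Grouping by source IP is one possible choice; grouping by Cowrie's `session` field would give per-session rows instead and may suit the session-duration feature better.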

Section 04

[Technical Details] Feature Engineering and Model Performance

Feature Engineering

Feature design reflects attack patterns:

High-frequency login attempts with short intervals → Brute-force attack
Complex command sequences + long sessions → Interactive attack
Large number of unique usernames → Dictionary attack
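
These rules of thumb could be written down as a simple labeling function, for example to sanity-check model output. The sketch below is illustrative only: the function name, feature keys, and every threshold are assumptions chosen for the example, not values taken from the project.

```python
def heuristic_label(features, min_attempts=10, max_interval=2.0,
                    min_usernames=5, min_commands=3, min_duration=30.0):
    """Map one feature row to a coarse attack label using illustrative thresholds."""
    # Complex command sequences plus a long session suggest a human-driven attack.
    if (features["command_count"] >= min_commands
            and features["session_duration"] >= min_duration):
        return "interactive"
    # Many login attempts in quick succession suggest automated guessing.
    if (features["total_attempts"] >= min_attempts
            and features["avg_interval"] <= max_interval):
        # Many distinct usernames points to a dictionary-style variant.
        if features["unique_usernames"] >= min_usernames:
            return "dictionary"
        return "brute_force"
    return "unknown"

brute = {"total_attempts": 50, "unique_usernames": 2, "avg_interval": 0.5,
         "command_count": 0, "session_duration": 3.0}
dictionary = dict(brute, unique_usernames=20)
interactive = {"total_attempts": 2, "unique_usernames": 1, "avg_interval": 8.0,
               "command_count": 8, "session_duration": 120.0}

labels = [heuristic_label(f) for f in (brute, dictionary, interactive)]
print(labels)
```

Fixed thresholds like these are brittle, which is one reason to learn the decision boundaries from data with a classifier instead.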

Model Performance

The Random Forest classifier achieves an average cross-validation accuracy of 71.67%. Given the diversity of attacks and the noise in honeypot logs, this is a usable baseline.
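
An average cross-validation accuracy of this kind can be computed with scikit-learn's `cross_val_score`. The sketch below uses synthetic data invented for the example (so it will not reproduce the 71.67% figure), and the fold count, number of trees, and value ranges are assumptions rather than the project's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def synthetic_rows(n, attempts, usernames, interval, commands, duration):
    """Generate n made-up feature rows; each argument is a (low, high) range."""
    total = rng.integers(attempts[0], attempts[1], size=n)
    success = rng.integers(0, 2, size=n)  # 0 or 1 successful login per row
    return np.column_stack([
        total,                                              # total_attempts
        rng.integers(usernames[0], usernames[1], size=n),   # unique_usernames
        total - success,                                    # failed_attempts
        success,                                            # successful_attempts
        rng.uniform(interval[0], interval[1], size=n),      # avg_interval (s)
        rng.integers(commands[0], commands[1], size=n),     # command_count
        rng.uniform(duration[0], duration[1], size=n),      # session_duration (s)
    ])

n = 60
# Synthetic stand-ins: brute-force rows have many fast attempts,
# interactive rows have few attempts but many commands and long sessions.
brute = synthetic_rows(n, (20, 200), (1, 30), (0.1, 3.0), (0, 3), (1.0, 60.0))
interactive = synthetic_rows(n, (1, 10), (1, 3), (5.0, 60.0), (3, 40), (30.0, 600.0))

X = np.vstack([brute, interactive])
y = np.array([0] * n + [1] * n)  # 0 = brute-force, 1 = interactive

clf = RandomForestClassifier(n_estimators=100, random_state=42)
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
mean_accuracy = scores.mean()
print(f"per-fold accuracy: {scores}, mean: {mean_accuracy:.4f}")
```

Because the synthetic classes are cleanly separated, the score here will be far higher than on real honeypot logs, where classes overlap and labels are noisy.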

Visualization

The project provides scripts that plot the confusion matrix and feature importances, which help explain the model's decision logic.
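
The numbers behind such plots can be obtained as in the following sketch. It is not the project's visualization script: the data comes from scikit-learn's `make_classification`, the feature names are hypothetical, and the output is printed as text rather than plotted. The same arrays could be passed to `sklearn.metrics.ConfusionMatrixDisplay` or a matplotlib bar chart.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical names for the seven features described in the article.
feature_names = [
    "total_attempts", "unique_usernames", "failed_attempts",
    "successful_attempts", "avg_interval", "command_count", "session_duration",
]

# Synthetic two-class data standing in for the honeypot feature CSV.
X, y = make_classification(n_samples=200, n_features=7, n_informative=4,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print("confusion matrix:")
print(cm)

# Impurity-based importances, sorted from most to least important.
ranked = sorted(zip(feature_names, clf.feature_importances_),
                key=lambda pair: pair[1], reverse=True)
for name, importance in ranked:
    print(f"{name:22s} {importance:.3f} {'#' * int(importance * 40)}")
```

The off-diagonal cells of the matrix show which class is mistaken for the other, and the ranking shows which features the forest relies on most.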

Section 05

[Application Scenarios] Project Value and Use Cases

Security Operations Center (SOC): Automatically classify honeypot alerts, helping analysts prioritize high-risk interactive attacks.
Threat Intelligence Collection: Accumulate labeled data to train more accurate models, collect attacker behavior patterns and tool preferences.
Security Research and Education: Provide practical cases for students/beginners, covering honeypot deployment, log analysis, feature engineering, and machine learning applications.

Section 06

[Improvement Directions] Project Optimization Suggestions

Expand Honeypot Types: Introduce Conpot (industrial control systems), Glastopf (web applications), etc., to expand attack coverage.
Try Other Algorithms: Such as XGBoost, LightGBM, or deep learning models to improve accuracy.
Real-Time Detection: Expand from offline batch processing to a real-time stream processing architecture.
Attacker Profiling: Combine multi-honeypot data to build behavior profiles and correlation analysis.
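
For the real-time detection direction, one building block is a sliding-window counter that updates per-source features as events arrive instead of reprocessing whole log files. The sketch below is a hypothetical illustration, not part of the project: the class name and window length are assumptions, and it expects timestamps in seconds that arrive in non-decreasing order for each source.

```python
from collections import defaultdict, deque

class SlidingWindowCounter:
    """Count login attempts per source IP within the last `window` seconds."""

    def __init__(self, window=60.0):
        self.window = window
        self.events = defaultdict(deque)  # src_ip -> timestamps inside the window

    def add(self, src_ip, timestamp):
        """Record one attempt and return the current count for this source."""
        q = self.events[src_ip]
        q.append(timestamp)
        # Evict timestamps that have fallen out of the window.
        while q and timestamp - q[0] > self.window:
            q.popleft()
        return len(q)

counter = SlidingWindowCounter(window=10.0)
counts = [counter.add("203.0.113.5", t) for t in (0.0, 1.0, 2.0, 15.0)]
print(counts)  # [1, 2, 3, 1]
```

A streaming pipeline could feed counts like these to the trained classifier whenever a source crosses an activity threshold.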

Section 07

[Summary] Project Significance and Target Audience

AI-Based-Honeypot-Attack-Detection-System is a typical project combining traditional security technology with modern AI, demonstrating the process of converting honeypot data into machine learning features and the application of Random Forest. Target Audience:

Students who are new to applying machine learning to network security
Security engineers who need to quickly build an attack detection prototype

It provides a clear, runnable starting point for both groups.