Zing Forum


Practical Comparison Between Neural Networks and Traditional Machine Learning in Network Intrusion Detection: Not All Attacks Are Detected Equally

An empirical study on flow-level network intrusion detection comparing the detection performance of a PyTorch Multilayer Perceptron (MLP) against traditional machine learning models (Logistic Regression, Random Forest) across attack types, revealing the security blind spots hidden behind overall accuracy.

Tags: Network Intrusion Detection · Machine Learning · PyTorch · Deep Learning · Network Security · Traffic Analysis · DDoS Detection · Random Forest · Logistic Regression
Published 2026-05-12 12:55 · Recent activity 2026-05-12 12:58 · Estimated read 6 min

Section 01

[Introduction] Practical Comparison Between Neural Networks and Traditional Machine Learning in Network Intrusion Detection: Focus on Detection Blind Spots Specific to Attack Types

Via the open-source project Network_ids-neural-vs-traditional-attack-eval, this study systematically compares a PyTorch Multilayer Perceptron (MLP) with traditional machine learning models (Logistic Regression, Random Forest) on flow-level network intrusion detection, focusing on detection-rate differences across attack types and the security blind spots hidden behind overall accuracy.


Section 02

Research Background: Why Attack Type Specificity Matters

Traditional intrusion-detection evaluations often focus on macro indicators such as overall accuracy, but in security practice some attack types are intrinsically harder to detect (e.g., APTs and insider threats disguise themselves as normal traffic, while DDoS produces obvious anomalies). If a model's detection rate for a given attack type is significantly low, serious security blind spots remain even when the overall metrics look good. The core question of this study: do the two model families differ systematically in how well they detect different attack types?


Section 03

Experimental Design and Data Preprocessing

Experimental Design: A binary classification framework (normal vs. attack) comparing three models: two traditional ML baselines (Logistic Regression, Random Forest) and a PyTorch MLP, evaluated on the CIC-IDS2017/CSE-CIC-IDS2018 benchmark datasets (which include attack types such as DDoS and brute-force attacks).
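To make the baseline comparison concrete, here is a minimal, self-contained sketch of the setup. The real experiments use CIC-IDS2017/CSE-CIC-IDS2018 flow features; synthetic data from `make_classification` stands in here so the sketch runs anywhere, and the model hyperparameters are illustrative assumptions, not the project's settings.

```python
# Sketch of the traditional-ML baseline comparison on an imbalanced
# binary problem (~90% "normal" = 0, ~10% "attack" = 1).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

baselines = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
recalls = {}
for name, model in baselines.items():
    model.fit(X_tr, y_tr)
    # Recall on the attack class = detection rate (1 - false negative rate)
    recalls[name] = recall_score(y_te, model.predict(X_te))
print(recalls)
```

Reporting recall rather than accuracy here anticipates the evaluation concerns discussed later: on imbalanced traffic, accuracy alone says little about attack detection.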

Data Preprocessing: Flow-level feature extraction (packet size, duration, etc.), missing-value handling, label encoding, standardization, and a train/test split; attack-type labels were retained for the subsequent per-type analysis.
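The preprocessing steps above might look like the following sketch. It assumes a CIC-IDS2017-style table with numeric flow features and a "Label" column ("BENIGN" or an attack name); the function name and column layout are assumptions for illustration, not the project's actual `src/data.py`.

```python
# Minimal preprocessing sketch: clean, binarize labels, split, standardize.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def preprocess(df: pd.DataFrame):
    # Drop rows with missing or infinite feature values
    df = df.replace([np.inf, -np.inf], np.nan).dropna()

    # Binary target: normal (0) vs. attack (1); keep the original
    # attack-type label for per-type analysis later
    attack_type = df["Label"]
    y = (attack_type != "BENIGN").astype(int)
    X = df.drop(columns=["Label"])

    # Split first, then fit the scaler on training data only
    # to avoid leaking test-set statistics into standardization
    X_tr, X_te, y_tr, y_te, t_tr, t_te = train_test_split(
        X, y, attack_type, test_size=0.2, stratify=y, random_state=42
    )
    scaler = StandardScaler().fit(X_tr)
    return (scaler.transform(X_tr), scaler.transform(X_te),
            y_tr, y_te, t_tr, t_te)
```

Passing `attack_type` through `train_test_split` alongside `y` is what keeps the per-attack-type labels aligned with the test set for the later breakdown.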


Section 04

Key Findings: Significant Differences in Detection Rates Across Attack Types

Key Insights:

  1. All models perform well on DDoS attacks, thanks to their distinctive traffic patterns;
  2. Web attacks (SQL injection, XSS) hide inside normal HTTP traffic and are considerably harder to detect;
  3. Brute-force attacks fall between the two;
  4. The false negative rate (missed detections) is the key indicator in security scenarios, since a missed attack is more harmful than a false alarm.

The visualization report attack_type_detection.png shows each model's detection rate per attack type.
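The per-type breakdown behind these findings can be sketched as a short helper. Variable names here are assumptions: `y_true` is the 0/1 ground truth, `y_pred` a model's predictions, and `attack_type` the original label retained through preprocessing.

```python
# Detection rate (recall, i.e. 1 - false negative rate) per attack type.
import pandas as pd

def detection_rate_by_type(y_true, y_pred, attack_type) -> pd.Series:
    """Fraction of attack flows of each type that the model flagged."""
    df = pd.DataFrame({"y": y_true, "pred": y_pred, "type": attack_type})
    attacks = df[df["y"] == 1]  # detection rate is defined on attack flows only
    return attacks.groupby("type")["pred"].mean()

rates = detection_rate_by_type(
    [1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0],
    ["DDoS", "Web", "DDoS", "Brute", "BENIGN"],
)
print(rates)  # DDoS 1.0, Web 0.0, Brute 1.0 on this toy input
```

A single call per model produces exactly the kind of per-type comparison that a plot like attack_type_detection.png visualizes.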

Section 05

The Trap of Overall Accuracy and Evaluation Recommendations

Over-reliance on overall accuracy easily masks risk: in a scenario with 95% normal traffic, predicting everything as normal still yields 95% accuracy. Recommended evaluation practices:

  1. Stratified evaluation: compute metrics per attack type to expose weak points;
  2. Confusion matrix analysis: focus on off-diagonal error patterns;
  3. Prioritize false negatives: a missed detection costs more than a false alarm;
  4. Handle imbalanced data: use resampling or class-weighting strategies.
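The accuracy trap is easy to demonstrate numerically. The sketch below uses synthetic counts (not the study's data): a degenerate predictor that calls everything "normal" scores 95% accuracy while missing every single attack, which the confusion matrix makes visible at a glance.

```python
# A 95%-normal traffic mix and an "all normal" predictor.
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

y_true = np.array([0] * 95 + [1] * 5)   # 95 normal flows, 5 attacks
y_pred = np.zeros(100, dtype=int)       # degenerate "all normal" predictor

print(accuracy_score(y_true, y_pred))   # 0.95 despite detecting nothing
print(recall_score(y_true, y_pred, zero_division=0))  # 0.0: every attack missed
print(confusion_matrix(y_true, y_pred)) # row 1 shows 5 false negatives
```

This is precisely why the recommendations above pair accuracy with per-class recall and confusion-matrix inspection.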

Section 06

Technical Implementation and Limitations

Technical Implementation: A modular framework (src/data.py for data processing, src/models.py for the PyTorch MLP definition, src/train.py for experiment scripts, src/evaluate.py for evaluation and analysis, reports/ for result output), designed to be easy to extend.
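As a rough idea of what an src/models.py-style MLP might contain, here is a hypothetical sketch; the class name, layer sizes, and dropout rate are assumptions, not the project's actual code.

```python
# Simple feed-forward network for binary flow classification.
import torch
import torch.nn as nn

class FlowMLP(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # single logit: attack vs. normal
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squeeze the trailing dim so output shape matches the label vector
        return self.net(x).squeeze(-1)
```

Emitting a single logit and training with `nn.BCEWithLogitsLoss` (which accepts a `pos_weight` argument) is one natural way to fold the class-weighting recommendation from the previous section into the neural model.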

Limitations: The datasets contain synthetic traces and distribution shift, some attack classes have too few samples, generalization is limited, and the system only detects without blocking. Future directions: more complex neural architectures (CNN/LSTM), online learning, and integration testing on real devices.


Section 07

Practical Value and Conclusion

Practical Value: Security teams can reproduce the comparison, verify model performance on their own data, identify detection deficiencies for specific attack types, balance complexity and performance, and establish internal evaluation systems.

Conclusion: Network security evaluation cannot only look at overall numbers; it needs to focus on attack type differences, false negative rates, and model limitations. Understanding the boundaries of capabilities is a core competency in security engineering.