Zing Forum

CRDA+VulnSec: How Small-Parameter Reasoning Models Achieve Multilingual Vulnerability Detection via Multi-Agent Collaboration

This article introduces a code vulnerability detection scheme built on large language model agents. It combines dual-source knowledge distillation, reasoning trajectory training, and iterative multi-hop RAG, and is reported to outperform traditional static analysis tools while remaining lightweight.

Tags: vulnerability detection, large language models, knowledge distillation, RAG, multi-agent, code security, reasoning models
Published 2026-05-18 23:06 | Recent activity 2026-05-18 23:18 | Estimated read 7 min

Section 01

Introduction: CRDA+VulnSec, Small-Parameter Reasoning Models for Multilingual Vulnerability Detection via Multi-Agent Collaboration

This article introduces CRDA+VulnSec, a code vulnerability detection scheme built on large language model agents. It pairs a small-parameter reasoning model with multi-agent collaboration and relies on three techniques: dual-source knowledge distillation, reasoning trajectory training, and iterative multi-hop RAG. The scheme is reported to outperform traditional static analysis tools while remaining lightweight, and it targets multilingual code vulnerability detection.

Section 02

Background: Dilemmas of Traditional Vulnerability Detection and Challenges in Large Model Applications

Detecting security vulnerabilities is a core challenge in software engineering. Traditional methods rely on static analysis tools (such as SonarQube and Fortify) and rule engines, which suffer from high rule maintenance costs, difficulty handling new types of vulnerabilities, high false positive rates, and insufficient multilingual support. Large language models have recently shown great potential in code understanding, but using them directly raises two problems: their large parameter counts make inference expensive, and they lack security-domain expertise. The key question is how to keep the model lightweight while improving its specialized capability.

Section 03

Methodology: CRDA+VulnSec Architecture and Core Technical Mechanisms

The project consists of the CRDA (Code Reasoning and Detection Agent) framework and the VulnSec system, and combines a small-parameter model with multi-agent collaboration. Its core techniques are:

  1. Dual-source knowledge distillation: Distill code understanding from large general-purpose code models and vulnerability detection experience from specialized security analysis models, fusing the two sources to avoid the bias of either one;
  2. Reasoning trajectory training: Train the model on the complete analysis trajectories of experts (understanding what the code does, recognizing suspicious patterns, and so on) so that it develops a structured way of analyzing code;
  3. Iterative multi-hop RAG: Query the knowledge base several times during an analysis, adjusting the retrieval strategy after each round, to improve the detection rate for complex vulnerabilities.
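The article does not give the training objective, so the following is only a minimal sketch of how a dual-source distillation loss could look. It assumes both teachers and the student emit logits over the same classes (for example "safe" and "vulnerable"), and it blends the two teachers with a weight `alpha` and a softmax `temperature`. The function names and both hyperparameters are hypothetical, not taken from CRDA+VulnSec.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def dual_source_loss(student_logits, code_teacher_logits,
                     security_teacher_logits, alpha=0.5, temperature=2.0):
    """Weighted sum of the student's divergence from each teacher.

    alpha weights the general code teacher and (1 - alpha) weights the
    security teacher. Both are assumed hyperparameters for illustration.
    """
    student = softmax(student_logits, temperature)
    code_teacher = softmax(code_teacher_logits, temperature)
    security_teacher = softmax(security_teacher_logits, temperature)
    return (alpha * kl_divergence(code_teacher, student)
            + (1.0 - alpha) * kl_divergence(security_teacher, student))
```

The loss is zero when the student matches both teachers and grows as it drifts from either one, which is one simple way to express "fusing information from both sources". A real training setup would typically add a supervised term on labeled data and compute this over batches in a tensor library.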

Section 04

Multi-Agent Collaboration Architecture Design

The system adopts a multi-agent collaboration architecture, decomposing vulnerability detection into subtasks:

  • Code understanding agent: Parses code structure and identifies key execution paths;
  • Pattern matching agent: Quickly identifies known vulnerability patterns;
  • Deep reasoning agent: Performs logical analysis for complex scenarios;
  • Verification agent: Cross-validates results to reduce false positives.

Agents collaborate via structured messages to improve accuracy, interpretability, and maintainability.
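To make the division of labor concrete, here is a toy sketch of four agents passing one structured message along a pipeline. Everything in it is an assumption for illustration: the `Message` and `Finding` types, the `RISKY_CALLS` table, and the string-matching heuristics stand in for the model-driven agents, whose actual interfaces the article does not describe.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    callee: str              # call that triggered the finding
    rule: str                # hypothetical rule identifier
    line: int                # 1-based line number in the snippet
    confirmed: bool = False  # set by the reasoning agent


@dataclass
class Message:
    """Structured message handed from one agent to the next."""
    code: str
    calls: list = field(default_factory=list)     # (line, callee) pairs
    findings: list = field(default_factory=list)  # Finding objects
    notes: list = field(default_factory=list)     # human-readable trace


# Hypothetical pattern table used only for this sketch.
RISKY_CALLS = {"eval": "code-injection", "os.system": "command-injection"}


def understanding_agent(msg):
    """Toy stand-in for parsing: record where risky callees are invoked."""
    for number, text in enumerate(msg.code.splitlines(), start=1):
        for callee in RISKY_CALLS:
            if callee + "(" in text:
                msg.calls.append((number, callee))
    return msg


def pattern_agent(msg):
    """Map each recorded call to a known vulnerability pattern."""
    for number, callee in msg.calls:
        msg.findings.append(Finding(callee, RISKY_CALLS[callee], number))
    return msg


def reasoning_agent(msg):
    """Toy stand-in for deep reasoning: a string-literal argument is
    treated as not attacker-controlled."""
    lines = msg.code.splitlines()
    for finding in msg.findings:
        argument = lines[finding.line - 1].split(finding.callee + "(", 1)[1].strip()
        is_literal = argument.startswith(("'", '"'))
        kind = "literal" if is_literal else "dynamic"
        msg.notes.append(f"line {finding.line}: {kind} argument")
        finding.confirmed = not is_literal
    return msg


def verification_agent(msg):
    """Drop findings the reasoning step did not confirm."""
    msg.findings = [f for f in msg.findings if f.confirmed]
    return msg


def run_pipeline(code):
    msg = Message(code=code)
    for agent in (understanding_agent, pattern_agent,
                  reasoning_agent, verification_agent):
        msg = agent(msg)
    return msg
```

Calling `run_pipeline` on a snippet returns the final message: `findings` holds what survived verification, and `notes` keeps the trace of why each candidate was kept or dropped. Because each agent only reads and writes fields of the shared message, an agent can be swapped or added without touching the others, which is the maintainability argument the section makes.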

Section 05

Evidence: Experimental Verification and Performance

The reported experiments support the scheme on two fronts:

  • On standard datasets, the detection rate exceeds that of traditional static analysis tools and the false positive rate is significantly lower; the model is an order of magnitude smaller than general-purpose large models yet stronger at specialized vulnerability detection;
  • In a real-world scenario (the Apache Spark codebase), the system independently discovered 8 previously unidentified security defects, including complex, deep vulnerabilities that span cross-function calls; experts confirmed that the findings have practical value.
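The article does not define its metrics, so as a reference point, the sketch below assumes "detection rate" means recall (true positives over all real vulnerabilities) and "false positive rate" means false alarms over all safe samples. The function name and the sample labels in the usage note are made up for illustration and are not results from the article.

```python
def detection_metrics(predicted, actual):
    """Return (detection_rate, false_positive_rate).

    predicted and actual are equal-length lists of booleans,
    where True means "vulnerable".
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)

    # Guard against empty classes to avoid division by zero.
    detection_rate = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    false_positive_rate = false_pos / (false_pos + true_neg) if false_pos + true_neg else 0.0
    return detection_rate, false_positive_rate
```

For example, with predictions `[True, True, False, False, True]` against ground truth `[True, False, False, True, True]`, two of the three real vulnerabilities are caught and one of the two safe samples is flagged, so the detection rate is 2/3 and the false positive rate is 1/2.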

Section 06

Recommendations: Practical Insights for Developers

The work suggests three takeaways for developers:

  1. Security detection does not have to rely on ultra-large models; with knowledge distillation and specialized training, small-parameter models can reach specialist-level performance, which makes them suitable for resource-constrained teams;
  2. The multi-agent architecture offers a scalable approach to complex security tasks, and teams can customize and extend the set of analysis agents;
  3. The iterative RAG mechanism combines external knowledge bases with model reasoning, which suits a security field whose knowledge is continuously updated.
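The iterative retrieval idea in point 3 can be sketched with a toy loop in which each retrieved entry widens the next query. This is an illustration under stated assumptions, not the system's implementation: the `KNOWLEDGE_BASE` entries, the keyword-overlap scoring, and the `related` terms (which stand in for the model deciding what to look up next) are all hypothetical.

```python
import re

# Hypothetical knowledge base: each entry lists the terms it matches
# and the follow-up terms it suggests for the next hop.
KNOWLEDGE_BASE = [
    {"id": "kb-sql", "keywords": {"execute", "query"},
     "text": "String-built SQL passed to execute() may allow injection.",
     "related": {"concat", "format"}},
    {"id": "kb-taint", "keywords": {"concat", "format", "request"},
     "text": "Check whether concatenated values come from user input.",
     "related": {"sanitize"}},
    {"id": "kb-sanitize", "keywords": {"sanitize", "escape"},
     "text": "Parameterized queries or escaping neutralize tainted input.",
     "related": set()},
]


def tokenize(text):
    """Lowercase the text and split it into identifier-like terms."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def retrieve(terms, seen):
    """Return the unseen entry with the largest keyword overlap, or None."""
    best, best_score = None, 0
    for entry in KNOWLEDGE_BASE:
        if entry["id"] in seen:
            continue
        score = len(entry["keywords"] & terms)
        if score > best_score:
            best, best_score = entry, score
    return best


def multi_hop_retrieve(code, max_hops=3):
    """Retrieve up to max_hops entries, expanding the query after each hop."""
    terms = tokenize(code)
    seen, evidence = set(), []
    for _ in range(max_hops):
        entry = retrieve(terms, seen)
        if entry is None:
            break  # nothing new matches, so stop early
        seen.add(entry["id"])
        evidence.append((entry["id"], entry["text"]))
        terms |= entry["related"]  # adjust the next query with what was learned
    return evidence
```

With the input `cursor.execute(sql)`, a single hop (`max_hops=1`) only reaches the SQL entry, while three hops also pull in the taint and sanitization entries, because each hop adds terms the original code never mentioned. Updating the knowledge base changes what later hops can find without retraining anything, which is why the mechanism fits a field that changes quickly.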

Section 07

Conclusion and Outlook

CRDA+VulnSec represents a new direction for AI-driven code security analysis: rather than simply replacing traditional tools, it aims for a detection system that is specialized, lightweight, and interpretable. As software complexity increases, solutions that integrate expert knowledge with machine learning will play an important role in securing the software supply chain.