Research on Vulnerability Handling Workflow Based on Role-Based Agent Architecture

This paper proposes a role-based agent workflow architecture for handling software security vulnerabilities, built around four core roles: planner, analyst, fixer, and verifier. With CodeQL integration and multi-model collaboration, it reaches 44% detection accuracy and 19% repair accuracy on 25 real C/C++ vulnerabilities.

Tags: agentic workflow, vulnerability handling, software security, role-based architecture, CodeQL, multi-agent system, LLM security
Published 2026-06-12 16:45 | Recent activity 2026-06-15 12:22 | Estimated read 6 min

Section 01

Research on Vulnerability Handling Workflow Based on Role-Based Agent Architecture (Introduction)

Original Author/Maintainer: Paper Author Team (arXiv)
Source Platform: arXiv
Publication Date: 2026-06-12
Original Link: http://arxiv.org/abs/2606.14261v1

Core Viewpoints: This paper proposes a role-based agent workflow architecture for handling software security vulnerabilities, built around four core roles: planner, analyst, fixer, and verifier. With CodeQL integration and multi-model collaboration, it reaches 44% detection accuracy and 19% repair accuracy on 25 real C/C++ vulnerabilities.


Section 02

Background and Problems

Current software security methods based on Large Language Models (LLMs) mostly target isolated tasks, such as vulnerability detection or patch generation, and lack agent architectures that reflect industrial practice. This leaves a significant gap between research systems and real-world work requirements. Single-task methods cannot model how security engineers collaborate in practice, so the paper argues for a multi-role collaborative agent architecture to make LLMs more effective in this setting.


Section 03

Core Method: Role-Based Agent Workflow

This study proposes a role-based agent workflow architecture, decomposed into four core roles:

  1. Planner: Formulates the overall vulnerability handling strategy and determines the analysis scope, repair priority, and resource allocation;
  2. Analyst: Analyzes vulnerability types, severity, and root causes in depth, using the integrated CodeQL static analysis tool to strengthen code analysis;
  3. Fixer: Generates repair solutions, such as code patches and configuration adjustments, based on the analysis results;
  4. Verifier: Validates the repair solutions, confirming that the vulnerability is fixed and that no new issues are introduced.
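
The four roles above can be read as a sequential pipeline in which each role enriches a shared case record. The following is a minimal Python sketch under that assumption. The `Case` fields, the function names, and the string-matching logic are all hypothetical stand-ins for LLM calls and CodeQL queries; they are not taken from the paper, which may use a different control flow (for example, a loop from verifier back to fixer).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class Case:
    """Hypothetical record handed from one role to the next."""
    source: str                      # code under review
    plan: str = ""                   # set by the planner
    finding: Optional[str] = None    # set by the analyst
    patch: Optional[str] = None      # set by the fixer
    verified: bool = False           # set by the verifier
    log: List[str] = field(default_factory=list)


Role = Callable[[Case], Case]


def planner(case: Case) -> Case:
    # Stand-in for strategy selection: fix the analysis scope up front.
    case.plan = "look for unbounded string copies"
    case.log.append("planner")
    return case


def analyst(case: Case) -> Case:
    # Stand-in for LLM analysis plus static-analysis findings (assumption).
    if "strcpy(" in case.source:
        case.finding = "possible buffer overflow via strcpy"
    case.log.append("analyst")
    return case


def fixer(case: Case) -> Case:
    # Only propose a patch when the analyst reported something.
    # Toy rewrite: `sizeof dst` is only meaningful if dst is an array.
    if case.finding is not None:
        case.patch = case.source.replace(
            "strcpy(dst, src)", 'snprintf(dst, sizeof dst, "%s", src)'
        )
    case.log.append("fixer")
    return case


def verifier(case: Case) -> Case:
    # Stand-in check: a patch exists and the flagged call is gone.
    case.verified = case.patch is not None and "strcpy(" not in case.patch
    case.log.append("verifier")
    return case


def run_workflow(
    case: Case, roles: Sequence[Role] = (planner, analyst, fixer, verifier)
) -> Case:
    """Run each role in order on the same case record."""
    for role in roles:
        case = role(case)
    return case


if __name__ == "__main__":
    result = run_workflow(Case(source="strcpy(dst, src);"))
    print(result.log, result.verified)
```

Passing `roles` as a parameter mirrors the extensibility claim in the summary: a new role is just another function inserted into the sequence.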

Section 04

Experimental Design and Results

Experimental Configuration: Evaluated on 25 real C/C++ vulnerabilities, using models including nemotron-cascade-2:30b, qwen3-coder-next, and gpt-oss:120b.

Results: Detection accuracy of 44% (comparable to GPT5.5) and repair accuracy of 19%. CodeQL integration is reported to significantly improve analysis depth and accuracy.
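
To put the percentages in terms of case counts, the sketch below computes accuracy as the fraction of correct outcomes and checks which whole-number counts out of 25 would produce each reported figure. The helper names are hypothetical, and the outcome list is synthetic, constructed only to illustrate the arithmetic. The summary does not say how the figures were aggregated (per model, per run, or averaged), so this is a reading aid rather than a reconstruction of the paper's evaluation.

```python
from typing import List, Sequence


def accuracy(outcomes: Sequence[bool]) -> float:
    """Fraction of outcomes that are True."""
    if not outcomes:
        raise ValueError("no outcomes to score")
    return sum(outcomes) / len(outcomes)


def matching_counts(percent: int, total: int) -> List[int]:
    """Whole-number success counts out of `total` that round to `percent`."""
    return [k for k in range(total + 1) if round(100 * k / total) == percent]


if __name__ == "__main__":
    # Synthetic example: 11 of 25 cases detected gives 44%.
    detected = [True] * 11 + [False] * 14
    print(f"{accuracy(detected):.0%}")

    print(matching_counts(44, 25))  # counts consistent with 44% on 25 cases
    print(matching_counts(19, 25))  # counts consistent with 19% on 25 cases
```

On 25 cases, 44% corresponds to 11 successes, while no whole-number count yields 19% (4 of 25 is 16%, 5 of 25 is 20%). The repair figure is therefore presumably aggregated across several models or runs, which the summary does not specify.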


Section 05

Practical Significance and Limitations

Significance:

  1. Enables end-to-end automation of software security workflows, easing the shortage of security talent;
  2. Provides a new mode of human-machine collaboration, with roles that correspond to actual team functions;
  3. The architecture is extensible, supporting new roles or adjusted responsibilities.

Limitations:

  1. Repair accuracy (19%) needs to be improved;
  2. The evaluation dataset is small (25 vulnerabilities);
  3. Only C/C++ is supported, and applicability to other languages remains to be verified.

Section 06

Conclusions and Future Directions

Conclusions: The role-based agent architecture opens a new direction for applying LLMs to software security. Decomposing the task into specialized roles improves interpretability and maintainability and demonstrates the potential of multi-agent collaboration.

Future Directions:

  1. Use reinforcement learning to optimize agent collaboration strategies;
  2. Integrate security vulnerability databases and best practice guidelines;
  3. Expand multi-language support;
  4. Develop real-time human-machine collaboration interfaces.