PromeFuzz: A Knowledge-Driven Fuzz Testing Driver Generation Tool Based on Large Language Models

An ACM CCS 2025 research project that uses large language models to automatically generate fuzz testing drivers, improving software security testing efficiency

Tags: fuzz testing, large language models, software security, vulnerability discovery, ACM CCS, automated testing, fuzzing
Published 2026-04-29 15:35 | Recent activity 2026-04-29 15:48 | Estimated read 6 min

Section 01

[Introduction] PromeFuzz: Overview of a Knowledge-Driven Fuzz Testing Driver Generation Tool Based on Large Language Models

PromeFuzz is an ACM CCS 2025 research project. Writing fuzz drivers by hand is time-consuming and tends to miss key scenarios; PromeFuzz addresses this by using the knowledge embedded in large language models (such as Claude and GPT-4) to generate test drivers automatically. The goal is to make software security testing more efficient and to lower the technical barrier to high-quality fuzz testing.


Section 02

Project Background: Pain Points of Fuzz Testing Drivers and Core Ideas

Fuzz testing depends on high-quality drivers, but writing them by hand requires a deep understanding of code interfaces, data formats, and constraints. The work is time-consuming and prone to missing key scenarios. PromeFuzz takes a knowledge-driven approach: the large language model is given the semantics of the source code (function signatures, types, control flow, and so on) and generates drivers using its internalized programming and security knowledge. This sets it apart from traditional template-based or rule-based methods.


Section 03

System Architecture: Automated Workflow from Source Code to Fuzz Testing

  1. Source Code Preprocessing and Knowledge Extraction: obtain the compilation configuration via Clang/LLVM, parse function signatures, structs, and related declarations, and build a project knowledge base;
  2. LLM Driver Generation: prompts are designed around the principles of accuracy, security, and exploration; supported models include Claude Sonnet and GPT-4o;
  3. Driver Selection and Optimization: evaluate candidate drivers on criteria such as parameter completeness and error handling, then compile binaries with ASan and coverage instrumentation;
  4. Fuzz Testing Execution: run the drivers with libFuzzer and collect coverage, crash information, and the corpus.
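
To make steps 1, 3, and 4 more concrete, here is a minimal Python sketch. It is not PromeFuzz's actual code. It assumes a Clang JSON compilation database entry as input, and the helper names (`include_and_define_flags`, `build_driver_command`, `fuzz_command`), the file names, and the `crashes/` directory are hypothetical. The Clang and libFuzzer flags are standard ones; whether PromeFuzz uses exactly this set is an assumption.

```python
import shlex


def entry_args(entry: dict) -> list[str]:
    """Return the argument list of one compilation-database entry.

    Entries carry either an "arguments" list or a single "command" string.
    """
    if "arguments" in entry:
        return list(entry["arguments"])
    return shlex.split(entry["command"])


def include_and_define_flags(entry: dict) -> list[str]:
    """Keep only -I / -D / -isystem flags, so that a generated driver
    sees the same headers and macros as the library itself."""
    args = entry_args(entry)
    kept: list[str] = []
    i = 0
    while i < len(args):
        arg = args[i]
        # Flag and value given as two separate arguments, e.g. "-isystem /opt/inc".
        if arg in ("-I", "-D", "-isystem") and i + 1 < len(args):
            kept += [arg, args[i + 1]]
            i += 2
            continue
        # Flag and value joined, e.g. "-Iinclude" or "-DNDEBUG".
        if arg.startswith(("-I", "-D")):
            kept.append(arg)
        i += 1
    return kept


def build_driver_command(driver_src: str, library: str,
                         flags: list[str], output: str) -> list[str]:
    """Compile a driver with libFuzzer, ASan, and source-based coverage."""
    return [
        "clang++", "-g", "-O1",
        "-fsanitize=fuzzer,address",   # libFuzzer entry point + AddressSanitizer
        "-fprofile-instr-generate",    # coverage instrumentation
        "-fcoverage-mapping",
        *flags, driver_src, library, "-o", output,
    ]


def fuzz_command(binary: str, corpus_dir: str, seconds: int) -> list[str]:
    """Run the driver for a fixed time budget, saving crashes under crashes/."""
    return [binary, f"-max_total_time={seconds}",
            "-artifact_prefix=crashes/", corpus_dir]
```

Each function returns an argument list rather than running anything, so the commands can be inspected or logged before being passed to `subprocess.run`.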

Section 04

Technical Highlights: Extended Support and Performance Optimization

  • Extended support for 27 open-source project configurations (network protocols, image processing, compression algorithms, etc.);
  • Built-in benchmark framework that supports JSONL manifests for defining test cases, facilitating tool comparison;
  • Python 3.11 optimization: compatibility issues were fixed and critical paths were accelerated with Numba JIT, for a reported speedup of about 30x.
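
For readers unfamiliar with JSONL manifests, the sketch below shows how such a file might be parsed: one JSON object per line, with blank lines ignored. The field names (`name`, `source_dir`, `time_budget_s`) and the `Target` class are hypothetical and chosen only for illustration; the schema PromeFuzz actually uses is not described here.

```python
import json
from dataclasses import dataclass


@dataclass
class Target:
    # Hypothetical fields; the real manifest schema may differ.
    name: str
    source_dir: str
    time_budget_s: int = 3600


def parse_manifest(text: str) -> list[Target]:
    """Parse JSONL text into a list of targets, one JSON object per line."""
    targets: list[Target] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the manifest line number so a broken record is easy to find.
            raise ValueError(f"manifest line {lineno}: {exc}") from exc
        targets.append(Target(
            name=record["name"],
            source_dir=record["source_dir"],
            time_budget_s=int(record.get("time_budget_s", 3600)),
        ))
    return targets
```

Because every line is an independent record, a manifest in this style can be extended by appending lines, which suits benchmark suites that grow over time.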

Section 05

Practical Application Scenarios: Multi-Domain Security Testing

  • Open-Source Auditing: Generate drivers for key libraries like curl and openssl to detect memory vulnerabilities;
  • CI/CD Integration: Automatically run fuzz tests after code changes to detect regression vulnerabilities early;
  • Security Research: Serve as a benchmark tool to evaluate the effectiveness of new fuzz testing algorithms.

Section 06

Installation and Usage Guide

  1. Install system dependencies: clang, llvm-dev, libclang-dev, cmake, bear;
  2. Create a Python virtual environment and install dependencies;
  3. Run setup.sh to build the C++ preprocessor;
  4. Configure Anthropic/OpenAI API keys;
  5. Prepare a JSONL test manifest and run setup_and_run_all.sh to start the workflow.
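
Before starting the workflow, it can help to confirm that the prerequisites from steps 1 and 4 are in place. The following is a small preflight sketch, not part of PromeFuzz. It assumes the API keys are supplied through the `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` environment variables (the names the official SDKs read by default; how PromeFuzz itself reads keys is an assumption), and it only checks for executables, since `llvm-dev` and `libclang-dev` are packages rather than commands.

```python
import os
import shutil

# Executables from step 1 that should be on PATH.
REQUIRED_TOOLS = ("clang", "cmake", "bear")
# Assumed environment variable names for step 4; at least one must be set.
API_KEY_VARS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def preflight(which=shutil.which, env=None) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready.

    `which` and `env` are injectable so the check can be tested
    without touching the real system.
    """
    env = os.environ if env is None else env
    problems = [f"missing tool: {tool}"
                for tool in REQUIRED_TOOLS if which(tool) is None]
    if not any(env.get(var) for var in API_KEY_VARS):
        problems.append("no API key set (" + " or ".join(API_KEY_VARS) + ")")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```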

Section 07

Limitations and Future Directions

Limitations: the tool depends on compilation databases, which can be hard to produce for complex projects; LLM API calls incur costs; and generation quality fluctuates from run to run.

Future directions: support for local open-source LLMs, coverage feedback to refine drivers, and better support for complex stateful APIs.


Section 08

Conclusion: The Value of PromeFuzz

PromeFuzz combines large language models with fuzz testing. By automating driver generation, it lowers the barrier to testing and aims to improve coverage and vulnerability detection, which makes it suitable for development teams, security researchers, and open-source maintainers.