BELMA: An Intelligent Contract Security Framework Combining Formal Verification and Large Language Models

BELMA is a two-layer smart contract vulnerability detection and automatic repair framework. The first layer uses bounded symbolic verification, while the second layer leverages fine-tuned LLMs to generate candidate patches, which are then re-verified in a closed-loop refinement cycle.

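Tags: smart contracts, formal verification, large language models, symbolic execution, automatic repair, blockchain security, LLM, vulnerability detection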

Published 2026-04-29 01:10 | Recent activity 2026-04-29 01:17 | Estimated read 7 min

Section 01

BELMA Framework Guide: A Smart Contract Security Solution Combining Formal Verification and LLMs

BELMA is a two-layer framework for detecting and automatically repairing smart contract vulnerabilities. It pairs the rigor of formal verification with the flexibility of large language models (LLMs). The first layer detects vulnerabilities using word vector models, symbolic execution, and the SWC rule base. The second layer uses fine-tuned LLMs to generate candidate patches and checks their correctness through a closed-loop refinement cycle (generate, verify, feed back, regenerate). Beyond known SWC vulnerabilities, the framework can also explore zero-day vulnerabilities, so it covers the full path from detection to repair.

Section 02

Background: Dual Challenges in Smart Contract Security and the Birth of BELMA

Smart contracts are difficult to modify once deployed, so a single vulnerability can cause large financial losses. Each established audit approach has a limitation: formal verification is rigorous but scales poorly to complex contracts; static analysis tools have high false positive rates; LLMs capture semantics but offer no mathematical guarantees. BELMA, published in IEEE TDSC 2025, addresses this trade-off by combining formal verification with LLMs in a two-layer detection and repair system.

Section 03

BELMA Architecture and Core Mechanisms of Automatic Repair

BELMA adopts a two-layer collaborative design:

Vulnerability Detection Layer: Integrates Word2Vec word embeddings (to capture semantic patterns), a symbolic execution engine (to explore the path space), and the SWC rule base (to identify known vulnerabilities). Together these detect both known vulnerabilities and potentially anomalous patterns.
Automatic Repair Layer: Uses fine-tuned LLMs to generate patches and introduces two key mechanisms:
- BiasScore: Analyzes historical repair patterns and adjusts prompts to reduce the LLM's systematic bias;
- ErrorScore: Evaluates patch boundary cases through bounded verification (k=16) to avoid introducing new vulnerabilities.

After detecting a vulnerability, the system passes structured context (AST nodes, data flow, etc.) to the repair module.
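
The hand-off of structured context from detection to repair could be sketched roughly as follows. This is an illustration only, not BELMA's actual data model: the `RepairContext` class, its field names, and the `render_prompt` helper are all hypothetical, and the example contract details are made up. The only external fact used is that SWC-107 is the SWC registry entry for reentrancy.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RepairContext:
    """Hypothetical hand-off record from the detection layer to the repair layer."""
    swc_id: str                      # SWC registry identifier, e.g. "SWC-107" (reentrancy)
    function_name: str               # function in which the issue was flagged
    ast_nodes: List[str] = field(default_factory=list)   # AST node descriptions
    data_flow: List[str] = field(default_factory=list)   # source -> sink edges


def render_prompt(ctx: RepairContext) -> str:
    """Flatten the structured context into plain text for a patch-generating LLM."""
    lines = [
        f"Vulnerability: {ctx.swc_id} in function {ctx.function_name}",
        "AST nodes:",
        *[f"  - {node}" for node in ctx.ast_nodes],
        "Data flow:",
        *[f"  - {edge}" for edge in ctx.data_flow],
        "Task: propose a minimal patch that removes the vulnerability.",
    ]
    return "\n".join(lines)


# Made-up example finding, for illustration only.
ctx = RepairContext(
    swc_id="SWC-107",
    function_name="withdraw",
    ast_nodes=["FunctionCall(msg.sender.call)", "Assignment(balances[msg.sender])"],
    data_flow=["balances[msg.sender] -> call.value"],
)
print(render_prompt(ctx))
```

Keeping the context as a typed record rather than free text means the same finding can be rendered into a prompt, logged, or passed to a verifier without re-parsing.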

Section 04

Closed-Loop Refinement Cycle and Zero-Day Vulnerability Exploration Capability

BELMA uses a closed-loop refinement process: candidate patches generated by the LLM are re-verified with bounded verification (k=16). If verification fails or the ErrorScore exceeds the threshold, feedback is returned to the LLM for regeneration, up to a maximum of 5 cycles. This addresses two weaknesses of LLM-only repair: hallucinated patches and the absence of verification. In addition, the beyond_swc module flags abnormal patterns with an anomaly filter, then uses LLM reasoning to generate hypotheses about them and verifies those hypotheses. This gives BELMA a way to explore zero-day vulnerabilities as attack methods in the blockchain field evolve.
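
The generate-verify-feedback-regenerate cycle can be sketched as a short loop. This is a minimal sketch under stated assumptions, not BELMA's implementation: the `refine` function, the `VerifyResult` record, the callback signatures, and the `error_threshold` default are hypothetical, and the generator and verifier below are stand-in stubs rather than a real LLM or bounded verifier. Only the bound k=16 and the cap of 5 cycles come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_CYCLES = 5   # maximum refinement cycles, as described above
BOUND_K = 16     # bounded-verification depth, as described above


@dataclass
class VerifyResult:
    passed: bool        # did bounded verification succeed?
    error_score: float  # hypothetical ErrorScore value for the patch
    feedback: str       # message handed back to the generator on failure


def refine(
    generate: Callable[[str, str], str],
    verify: Callable[[str, int], VerifyResult],
    context: str,
    error_threshold: float = 0.1,  # assumed value; the source gives no number
) -> Optional[str]:
    """Return the first patch that verifies, or None after MAX_CYCLES attempts."""
    feedback = ""
    for _ in range(MAX_CYCLES):
        patch = generate(context, feedback)
        result = verify(patch, BOUND_K)
        if result.passed and result.error_score <= error_threshold:
            return patch
        feedback = result.feedback  # feed the failure back into the next attempt
    return None


# Stand-in stubs: the first attempt fails, the second one (with feedback) passes.
def fake_generate(context: str, feedback: str) -> str:
    return "patched-v2" if feedback else "patched-v1"


def fake_verify(patch: str, k: int) -> VerifyResult:
    if patch == "patched-v2":
        return VerifyResult(True, 0.0, "")
    return VerifyResult(False, 1.0, "reentrancy still reachable")


print(refine(fake_generate, fake_verify, "contract source"))  # prints patched-v2
```

Returning `None` when the cycle budget runs out makes the failure explicit, so a caller can escalate to manual review instead of accepting an unverified patch.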

Section 05

Engineering Implementation and Reproducibility Experiment Design

BELMA provides complete experiment reproduction scripts covering baseline data for RQ1-RQ4, organized in a clear module structure (detection, repair, optimization, etc.). Configuration is centralized in belma_config.yaml to support reproducibility. The project includes comparative experiments against tools such as Echidna and sFuzz, as well as sensitivity analyses such as complexity stratification and single-node ablation, in line with the verifiability standards of academic research.

Section 06

Deployment Considerations and Current Limitations Analysis

Practical deployment needs to consider:

Computational cost: Symbolic execution and repeated LLM calls add significant overhead;
Latency: The closed-loop refinement cycle can extend repair time;
Platform support: Mainly optimized for Ethereum; the Fabric and EOS adapters are less mature.

The project documentation includes DEPLOYMENT.md and FAILURE_TAXONOMY.md, which discuss failure modes and response strategies.

Section 07

Conclusion: A New Paradigm Integrating Formal Methods and AI, and Its Application Prospects

BELMA represents an important direction in smart contract security: combining the rigor of formal methods with the flexibility of LLMs so that each offsets the other's weaknesses. For developers, it provides a complete path from vulnerability discovery to repair; for researchers, it demonstrates an engineering integration of LLMs and formal verification. As LLM capabilities improve, this hybrid 'AI generation + formal verification' paradigm is expected to be applied in more safety-critical fields.
