# SocketAI Reproduction: Detecting Malicious npm Packages Using LLM

> Open-source reproduction of the ICSE 2025 paper SocketAI, implementing an npm package malicious code detection tool based on a three-stage LLM analysis workflow, supporting CodeQL pre-screening and full experimental data export.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-08T08:14:55.000Z
- Last activity: 2026-04-08T08:21:04.554Z
- Popularity: 148.9
- Keywords: npm, security, LLM, malware detection, CodeQL, static analysis, supply chain security
- Page link: https://www.zingnex.cn/en/forum/thread/socketai-llm-npm
- Canonical: https://www.zingnex.cn/forum/thread/socketai-llm-npm

---

## SocketAI Reproduction: Guide to the LLM-Powered Malicious npm Package Detection Tool

This article introduces an open-source reproduction of SocketAI, a system described in an ICSE 2025 paper. The tool detects malicious code in npm packages using a three-stage LLM analysis workflow, with optional CodeQL pre-screening and full export of experimental data. It targets a known weakness of traditional static analysis in the npm ecosystem: difficulty keeping up with new kinds of malicious attacks.

## Research Background and Motivation

npm is the world's largest software package repository, with over 2 million packages. That scale brings convenience, but also the risk of malicious code injection: install scripts that execute malicious commands, tampered dependencies, and payloads hidden behind obfuscation. Traditional detection methods have limitations. Signature-based detection needs frequent rule updates, and behavior-based detection suffers from high false positive rates. The semantic understanding of LLMs can compensate for these shortcomings by distinguishing code that looks syntactically similar but has different intent (e.g., cleaning temporary files vs. deleting system files).

## SocketAI's Core Methodology

SocketAI adopts a three-stage progressive analysis strategy:

1. Initial Malicious Assessment: The LLM quickly screens each file to evaluate its potential malicious level (considering obfuscation, network requests, sensitive paths, etc.).
2. Self-Review and Correction: The model reflects on its initial judgment to reduce misjudgments caused by insufficient context or superficial similarity.
3. Final File-Level Determination: The model integrates the information from the earlier stages to provide a clear malicious score and a reasoning process for manual review.
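The three stages above can be sketched as a chain of prompts in which each stage receives the output of the previous one. This is a minimal illustration, not the project's actual code: the prompt wording, the `FileVerdict` type, the `analyze_file` function, and the JSON reply format with `score` and `reasoning` keys are all assumptions made for this example. The LLM is passed in as a plain callable (`ask_llm`), so any client can be plugged in.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FileVerdict:
    """Hypothetical container for the file-level result."""
    path: str
    score: float                 # 0.0 (benign) .. 1.0 (malicious)
    reasoning: str
    stage_outputs: List[str] = field(default_factory=list)


def analyze_file(path: str, source: str, ask_llm: Callable[[str], str]) -> FileVerdict:
    # Stage 1: initial assessment of the raw file.
    initial = ask_llm(
        "Assess this npm package file for malicious behaviour "
        "(obfuscation, network requests, sensitive paths).\n"
        f"File: {path}\n{source}"
    )
    # Stage 2: the model critiques its own first answer.
    review = ask_llm(
        "Critically review the assessment below. Point out conclusions that "
        "rest on missing context or superficial similarity.\n"
        f"Assessment: {initial}\nFile: {path}\n{source}"
    )
    # Stage 3: final verdict, requested as JSON (an assumed format).
    final = ask_llm(
        'Reply with JSON only: {"score": <number in 0..1>, "reasoning": "<text>"}.\n'
        f"Initial assessment: {initial}\nReview: {review}"
    )
    parsed = json.loads(final)
    # Clamp the score so a sloppy model reply cannot leave the 0..1 range.
    score = min(1.0, max(0.0, float(parsed["score"])))
    return FileVerdict(path, score, str(parsed.get("reasoning", "")), [initial, review, final])
```

Keeping every stage's raw output on the result object mirrors the observability goal of the reproduction: a reviewer can see how the verdict evolved between the first assessment and the final determination.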

## Highlights of Engineering Implementation

The reproduced version balances academic rigor and engineering practicality:

1. Optional CodeQL Pre-screening: Uses CodeQL to quickly identify risk patterns, reducing the amount of LLM analysis.
2. Observability: Exports data from each analysis step (prompt, response, token consumption, time cost, etc.).
3. Flexible Input: Supports local directories and tgz/tar/zip archives.
4. Batch Processing: Performs batch detection via JSONL/CSV lists, with errors in individual samples not interrupting the batch.
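The batch-processing behaviour, where one failing sample does not stop the rest, can be illustrated with a short sketch. It assumes a JSONL list in which each line is an object with a `path` key; that key name, the `run_batch` function, and the shape of the returned records are hypothetical and chosen only for this example. The detector is passed in as a callable.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator


def run_batch(lines: Iterable[str], detect: Callable[[str], Any]) -> Iterator[Dict[str, Any]]:
    """Run `detect` on every JSONL entry, isolating per-sample failures."""
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the list file
        try:
            entry = json.loads(line)
            path = entry["path"]  # assumed key name
            record = {"line": line_no, "ok": True, "path": path, "result": detect(path)}
        except Exception as exc:
            # Malformed JSON, a missing key, or a detector failure is
            # recorded for this sample only; the loop keeps going.
            record = {"line": line_no, "ok": False, "error": f"{type(exc).__name__}: {exc}"}
        yield record
```

Because the function is a generator, results can be written out as they arrive, so a long batch leaves usable partial output even if the process is interrupted.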

## Usage and Output Structure

The tool is written in Python and uses uv for dependency management. It can be run in a basic mode without CodeQL, or with CodeQL pre-screening enabled. Key parameters:

- `input`: path to the package to analyze
- `model`: the LLM to use
- `use-codeql` / `no-codeql`: enable or disable CodeQL pre-screening
- `threshold`: the score at or above which a result is judged malicious
- `temperature`: the sampling temperature, which controls how random the model's output is

The output is clearly structured and includes runtime metadata, a package-level summary, the file list, performance metrics, detailed results for each stage, and exported data in CSV format.
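To make the role of the threshold concrete, the sketch below turns per-file scores into a package-level summary and a CSV export. The source does not state how the reproduction aggregates file scores, so the rule used here (a package is flagged when any file scores at or above the threshold) is an assumption, as are the function names, the summary keys, and the CSV column names.

```python
import csv
import io
from typing import Dict


def summarize_package(file_scores: Dict[str, float], threshold: float) -> dict:
    """Assumed rule: a package is malicious if any file reaches the threshold."""
    flagged = sorted(p for p, s in file_scores.items() if s >= threshold)
    return {
        "files_analyzed": len(file_scores),
        "max_score": max(file_scores.values(), default=0.0),
        "flagged_files": flagged,
        "malicious": bool(flagged),
    }


def files_to_csv(file_scores: Dict[str, float], threshold: float) -> str:
    """Render one CSV row per file; the column names are illustrative."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["path", "score", "flagged"])
    for path in sorted(file_scores):
        score = file_scores[path]
        writer.writerow([path, score, score >= threshold])
    return buf.getvalue()
```

With scores of `{"index.js": 0.1, "scripts/postinstall.js": 0.9}` and a threshold of 0.5, only the install script is flagged and the package as a whole is marked malicious. Raising the threshold trades missed detections for fewer false positives.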

## Practical Significance and Outlook

For security teams, the tool complements existing detection systems: traditional rules capture known threats, while the LLM helps discover new attacks. For researchers, the full data export makes it easier to test new ideas, such as swapping models or adjusting analysis strategies. The tool provides a practical research platform for npm security detection and is an example of turning academic results into engineering practice.

## Conclusion

Software supply chain attacks are becoming increasingly frequent, making npm package security detection crucial. The SocketAI reproduction demonstrates the potential of LLMs in the security field. Although not a panacea, it offers capabilities that are hard for traditional methods to achieve in scenarios requiring semantic understanding. The open-source reproduction not only verifies the original paper's method but also gives the community a runnable baseline implementation that can be improved.
