# LLM Code Smells: Identifying Anti-Patterns and Best Practices in Large Language Model Integration

> This article summarizes a large-scale empirical study of LLM code smells. The research team built a taxonomy of 9 common anti-patterns and developed a static analysis tool called SpecDetect4LLM. A scan of more than 170,000 source files from 692 open-source projects found that 73.5% of systems contain LLM code smells, and the tool's detection precision is 91.3%.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-21T19:10:08.000Z
- Last activity: 2026-05-25T03:21:13.354Z
- Popularity: 68.0
- Keywords: LLM, code smells, static analysis, software quality, prompt engineering, best practices, SpecDetect4LLM
- Page link: https://www.zingnex.cn/en/forum/thread/llm-a112e6ea
- Canonical: https://www.zingnex.cn/forum/thread/llm-a112e6ea

---

## Introduction: Key Points of the LLM Code Smells Study

This article is based on the research paper 'LLM Code Smells: A Taxonomy and Detection Approach', published on arXiv in May 2026. Its key contributions are:
1. A taxonomy of LLM code smells covering 9 common anti-patterns;
2. A static analysis tool, SpecDetect4LLM, which supports multiple languages including Python, JS/TS, and Java;
3. A scan of over 170,000 source files from 692 open-source projects, which found that 73.5% of systems contain LLM code smells;
4. A detection precision of 91.3%, which makes the tool a practical way to improve the quality of LLM integration.

The study shows how widespread LLM integration problems are, which matters for the quality of AI-driven software systems.

## Background: Definition of LLM Code Smells and Industry Challenges

As LLMs are built into more software systems, poor integration can damage system quality and maintainability. LLM code smells are programming practices in source code that use LLM inference inappropriately. Like traditional code smells, they signal potential design issues, but the non-deterministic nature of LLMs makes them harder to detect. They rarely cause obvious errors at first; as a system grows, they tend to surface as performance bottlenecks, security risks, or maintenance difficulties. Best practices for LLM integration are not yet widely adopted, so systematic identification methods are needed.

## Methodology: Taxonomy of LLM Code Smells and Detection Tool

### Nine Categories of LLM Code Smells
The study's taxonomy spans the full LLM call chain and groups the nine smells into four layers:
1. **Prompt Engineering Layer**: hard-coded prompts; no prompt version control;
2. **Input Processing Layer**: missing input sanitization; no context truncation strategy;
3. **Output Processing Layer**: use of unverified output; ignoring inconsistent output formats;
4. **Architecture Design Layer**: dependency on a single LLM; no abstraction layer for LLM calls; no LLM performance monitoring.
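
To make two of these smells concrete, the sketch below contrasts a hard-coded prompt with unverified output against a version that keeps the prompt in a named, versioned table and validates the reply. It is an illustration only, not code from the paper: `fake_llm`, `PROMPTS`, `ALLOWED`, and both `classify_*` functions are hypothetical names, and the model call is a stub.

```python
import json
from string import Template

# Hypothetical stand-in for a real model call; it returns a canned reply.
def fake_llm(prompt: str) -> str:
    return '{"sentiment": "positive"}'

# Smelly version: the prompt is written inline and the reply is trusted blindly.
def classify_smelly(text: str) -> str:
    reply = fake_llm("Classify the sentiment of: " + text)
    return json.loads(reply)["sentiment"]  # raises on malformed or unexpected output

# Prompts live in one named, versioned place instead of at the call site.
PROMPTS = {"sentiment/v1": Template("Classify the sentiment of: $text")}
ALLOWED = {"positive", "negative", "neutral"}

def classify_checked(text: str, llm=fake_llm) -> str:
    prompt = PROMPTS["sentiment/v1"].substitute(text=text)
    reply = llm(prompt)
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return "unknown"  # fallback when the reply is not valid JSON
    label = data.get("sentiment") if isinstance(data, dict) else None
    # Accept only labels from the expected set; everything else falls back.
    if isinstance(label, str) and label in ALLOWED:
        return label
    return "unknown"
```

Passing the model call in as a parameter also makes the validation path easy to exercise with deliberately malformed replies.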

### SpecDetect4LLM Tool
SpecDetect4LLM is a static analysis tool built on AST analysis and data-flow tracking. It detects smells without executing the code and supports mainstream programming languages.
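
The general idea of AST-based smell detection can be sketched with Python's standard `ast` module. The example below is not SpecDetect4LLM's implementation, and it performs no data-flow tracking: it only flags calls whose name appears in a hypothetical set (`LLM_CALL_NAMES`) when a string literal or f-string is passed directly as an argument, which is a rough proxy for a hard-coded prompt.

```python
import ast

# Hypothetical set of call names treated as LLM invocations in this sketch.
LLM_CALL_NAMES = {"complete", "generate", "chat"}

def call_name(node: ast.Call) -> str:
    """Return the called function or method name, or '' if it is not a simple name."""
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return ""

def is_inline_text(node: ast.AST) -> bool:
    """True for a string literal or an f-string written at the call site."""
    if isinstance(node, ast.JoinedStr):
        return True
    return isinstance(node, ast.Constant) and isinstance(node.value, str)

def find_hardcoded_prompts(source: str) -> list[int]:
    """Return line numbers of LLM-like calls that receive inline text."""
    tree = ast.parse(source)  # parses the code without executing it
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and call_name(node) in LLM_CALL_NAMES:
            args = list(node.args) + [kw.value for kw in node.keywords]
            if any(is_inline_text(arg) for arg in args):
                hits.append(node.lineno)
    return sorted(hits)
```

A name-based rule like this will miss prompts assigned to a variable first and will flag unrelated methods that happen to share a name, which is why tracking how values flow into the call matters for a real detector.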

## Empirical Study Results: Prevalence of LLM Code Smells and Tool Effectiveness

The study scanned 692 open-source projects (171,194 source files) and reported the following:
- **Prevalence**: 73.5% of systems contain at least one LLM code smell;
- **Tool Performance**: SpecDetect4LLM achieves a precision of 91.3% (most reported issues are real) and a recall of 71.8% (sufficient for preliminary screening);
- **Smell Distribution**: Prompt engineering smells are the most prominent (hard-coded prompts, no version control), and input processing smells are also common, which points to insufficient security awareness.
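
The two glosses in the tool-performance bullet match the standard definitions of precision and recall. Assuming that reading (the 91.3% figure is described as "most reported issues are real"), and writing $TP$, $FP$, and $FN$ for true positives, false positives, and false negatives among the tool's reports:

$$
\text{precision} = \frac{TP}{TP + FP}, \qquad \text{recall} = \frac{TP}{TP + FN}
$$

Under this interpretation, roughly 9 of every 100 reported smells would be false alarms, while about 28% of the smells actually present would go unreported.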

## Practical Recommendations: Key Measures to Improve LLM Integration Quality

Based on the study's findings, developers can take the following measures:
1. **Prompt Management**: Move prompt text out of the code and manage it with templates or configuration tools, so that prompts can be versioned and changes tracked;
2. **Input/Output Validation Layer**: Apply length limits, filtering, and validation to inputs; parse outputs into a structured form and prepare fallback strategies;
3. **LLM Abstraction Layer**: Wrap LLM calls behind an interface to reduce coupling, which makes monitoring, vendor switching, and resilience patterns easier to add;
4. **Continuous Monitoring**: Track LLM call latency, token consumption, and error rates, set up alerts, and review costs regularly.
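
Several of these measures can live in one thin wrapper. The sketch below is one possible shape, not a design from the paper: `LLMClient`, `CallStats`, and the `Provider` type are hypothetical names, providers are plain callables rather than a real vendor SDK, and the 4000-character limit is an arbitrary placeholder.

```python
import time
from dataclasses import dataclass, field
from typing import Callable

# A provider is any callable that maps a prompt to a reply (assumption for this sketch).
Provider = Callable[[str], str]

@dataclass
class CallStats:
    calls: int = 0
    errors: int = 0
    total_latency_s: float = 0.0

@dataclass
class LLMClient:
    primary: Provider
    fallback: Provider
    max_input_chars: int = 4000  # placeholder limit, not a recommended value
    stats: CallStats = field(default_factory=CallStats)

    def complete(self, prompt: str) -> str:
        # Input processing: enforce a simple length limit before any call.
        prompt = prompt[: self.max_input_chars]
        # Resilience: try the primary provider first, then the fallback.
        for provider in (self.primary, self.fallback):
            self.stats.calls += 1
            start = time.perf_counter()
            try:
                return provider(prompt)
            except Exception:
                self.stats.errors += 1
            finally:
                # Monitoring: record latency for successful and failed calls alike.
                self.stats.total_latency_s += time.perf_counter() - start
        raise RuntimeError("all providers failed")
```

Because application code depends only on `LLMClient.complete`, swapping vendors or adding retries, token accounting, and alerts would touch this one class rather than every call site.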

## Conclusion: Significance of LLM Code Smells Research and Future Directions

The LLM code smells study offers the software industry useful insight as it adopts AI. The 73.5% prevalence figure suggests that best practices are not yet widely adopted. The open-source SpecDetect4LLM tool gives the community a means of detection, but the core remedy is for developers to account for the specifics of LLM integration at design time and to build quality awareness into daily practice. As LLMs move into more critical systems, eliminating these smells will become key to ensuring system quality, and code quality standards for the AI-driven era are being redefined.
