# LLMLogAnalyzer: Research on Large Language Model-based Log Anomaly Detection Using Prompt Engineering

> This article introduces a Java Spring Boot project that uses large language models (LLMs) and prompt engineering techniques for system log anomaly detection, comparing the effects of three prompt strategies: zero-shot, rule-driven, and template-aware.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-13T17:14:10.000Z
- Last activity: 2026-06-13T17:22:20.158Z
- Keywords: large language models, prompt engineering, log anomaly detection, LLM, BGL dataset, system operations, Java, Spring Boot, Qwen2.5
- Page link: https://www.zingnex.cn/en/forum/thread/llmloganalyzer-8bc9ecdd
- Canonical: https://www.zingnex.cn/forum/thread/llmloganalyzer-8bc9ecdd

---

## [Introduction] Core Overview of the LLMLogAnalyzer Project

LLMLogAnalyzer is a master's research project built on Java Spring Boot that explores how prompt engineering can improve the performance of large language models (LLMs) in system log anomaly detection. The project uses the BGL supercomputer log dataset and compares three prompt strategies (zero-shot, rule-driven, and template-aware) to provide a reference for applying LLMs in operation and maintenance (O&M) scenarios.

## Project Background and Dataset Introduction

System log anomaly detection is a core challenge in O&M. Traditional methods rely on hand-written rules or on supervised learning, which requires extensive manual feature engineering. LLMs, by contrast, can interpret the semantics of a log message and judge whether it is anomalous based on its likely impact on the system. The project uses the BGL dataset (IBM Blue Gene/L supercomputer logs, each labeled normal or abnormal), and the model must return its classification as JSON (0 for normal, 1 for abnormal).
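As a rough illustration of how ground-truth labels might be read from the dataset, the sketch below assumes the public BGL layout, in which the first whitespace-separated token of each line is `-` for a normal message and an alert tag otherwise. The class name `BglLabelSketch` and the record `LabeledLog` are hypothetical; the project's own `BglParser` is not shown in the article and may work differently.

```java
// Hypothetical helper for illustration only; not the project's BglParser.
final class BglLabelSketch {

    /** One parsed line: ground-truth label (0 = normal, 1 = abnormal) plus the remaining log text. */
    record LabeledLog(int label, String content) {}

    /**
     * Assumes the first token is "-" for a normal line and an alert tag otherwise.
     * Everything after the first token is kept as the text shown to the model.
     */
    static LabeledLog parse(String rawLine) {
        String line = rawLine.strip();
        int firstSpace = line.indexOf(' ');
        if (firstSpace < 0) {
            throw new IllegalArgumentException("Expected a label token followed by log text");
        }
        String tag = line.substring(0, firstSpace);
        String content = line.substring(firstSpace + 1).strip();
        return new LabeledLog("-".equals(tag) ? 0 : 1, content);
    }
}
```

Keeping the label separate from the content matters here: the label token must be stripped before the line is sent to the model, otherwise the prompt would leak the answer.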

## Comparison of Three Prompt Engineering Strategies

The project compares three prompt strategies:
1. Zero-shot prompt: contains no BGL-specific knowledge. The model classifies each line using general anomaly indicators and is told not to over-rely on keywords such as ERROR;
2. Rule-driven prompt: defines a structured decision process (first check for known anomaly and normal indicators, then fall back to rules based on system impact), injecting domain knowledge to reduce false positives;
3. Template-aware prompt: provides examples of abnormal and normal BGL log patterns. It injects the most domain knowledge of the three and is therefore expected to perform best.
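The three strategies can be pictured as one enum with increasing amounts of domain knowledge. This is only a sketch: the prompt wording is invented for illustration, the `label` key in the JSON output is an assumption, and the angle-bracket entries are placeholders where real BGL patterns would go. The project's actual `PromptGenerator` templates are not reproduced in the article.

```java
// Hypothetical prompt wording; not the project's PromptGenerator templates.
enum PromptStrategySketch {
    ZERO_SHOT("""
            You are a log analysis assistant.
            Classify the log line as normal (0) or abnormal (1) based on its likely system impact.
            Do not rely on keywords such as ERROR alone.
            """),
    RULE_DRIVEN("""
            You are a log analysis assistant.
            Step 1: if the line matches a known anomaly indicator, answer 1.
            Step 2: if the line matches a known normal indicator, answer 0.
            Step 3: otherwise, answer 1 only if the event would impair the system.
            """),
    TEMPLATE_AWARE("""
            You are a log analysis assistant for BGL supercomputer logs.
            Abnormal patterns: <placeholder: abnormal BGL template examples>
            Normal patterns: <placeholder: normal BGL template examples>
            Classify the log line by comparing it with these patterns.
            """);

    // Shared output contract; the key name "label" is an assumption.
    private static final String OUTPUT_RULE =
            "Respond with JSON only: {\"label\": 0} or {\"label\": 1}.\n";

    private final String instructions;

    PromptStrategySketch(String instructions) {
        this.instructions = instructions;
    }

    /** Builds the full prompt for a single log line. */
    String render(String logLine) {
        return instructions + OUTPUT_RULE + "Log line: " + logLine + "\n";
    }
}
```

Sharing the output rule across all three strategies keeps the comparison fair: only the injected knowledge differs, not the response format.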

## Technical Architecture and Model Deployment Plan

The project is built on the Java Spring Boot framework. Its core components are BglParser (log parsing), PromptGenerator (prompt templates), CallModelAi (LLM API calls), and EvaluationMetricsService (metric calculation). The tech stack consists of Java 17, MongoDB, Ollama (for running the LLM locally), and the Qwen2.5 7B model. Local deployment offers data privacy, cost control, low latency, and customizability.
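A call to a locally running Ollama server could look roughly like the sketch below, which uses only the JDK's `java.net.http` client. It assumes Ollama's default address (`http://localhost:11434`), its `/api/generate` endpoint, and the model tag `qwen2.5:7b`; the class and method names are hypothetical and this is not the project's `CallModelAi` implementation.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical sketch; not the project's CallModelAi class.
final class OllamaCallSketch {
    static final String ENDPOINT = "http://localhost:11434/api/generate"; // assumed default address
    static final String MODEL = "qwen2.5:7b";                             // assumed model tag

    /** Escapes a string so it can be embedded in a JSON string literal. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"' -> sb.append("\\\"");
                case '\\' -> sb.append("\\\\");
                case '\n' -> sb.append("\\n");
                case '\r' -> sb.append("\\r");
                case '\t' -> sb.append("\\t");
                default -> {
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
                }
            }
        }
        return sb.toString();
    }

    /** Builds a non-streaming request body that asks Ollama for JSON-formatted output. */
    static String buildBody(String prompt) {
        return "{\"model\":\"" + MODEL + "\",\"prompt\":\"" + escapeJson(prompt)
                + "\",\"stream\":false,\"format\":\"json\"}";
    }

    /** Sends the prompt and returns the raw HTTP body; the model text sits in its "response" field. */
    static String generate(String prompt) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(60))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(prompt)))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

In a Spring Boot service a JSON library such as Jackson would normally build the body; the manual escaping here only keeps the sketch dependency-free.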

## Multi-dimensional Evaluation Metric System

The project uses comprehensive metrics to evaluate the effectiveness of the strategies:
- Basic classification metrics: Accuracy, precision, recall, F1 score;
- Confusion matrix metrics: TP (true positive), TN (true negative), FP (false positive), FN (false negative);
- Additional metrics: Invalid response rate (proportion of non-JSON outputs), average response time. These cover classification performance, output quality, and inference efficiency.
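The classification metrics above follow directly from the four confusion matrix counts. The record below is a minimal sketch using the standard definitions; the name `ConfusionMatrixSketch` is hypothetical and the project's `EvaluationMetricsService` may structure this differently. Ratios with a zero denominator are reported as 0.0, which is a convention chosen for this sketch.

```java
// Hypothetical sketch of the metric formulas; not the project's EvaluationMetricsService.
record ConfusionMatrixSketch(long tp, long tn, long fp, long fn) {

    // Returns 0.0 instead of NaN when the denominator is zero (a convention of this sketch).
    private static double ratio(long numerator, long denominator) {
        return denominator == 0 ? 0.0 : (double) numerator / denominator;
    }

    /** Share of all predictions that were correct. */
    double accuracy() {
        return ratio(tp + tn, tp + tn + fp + fn);
    }

    /** Share of predicted anomalies that were real anomalies. */
    double precision() {
        return ratio(tp, tp + fp);
    }

    /** Share of real anomalies that were detected. */
    double recall() {
        return ratio(tp, tp + fn);
    }

    /** Harmonic mean of precision and recall. */
    double f1() {
        double p = precision();
        double r = recall();
        return (p + r) == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    /** Share of model responses that could not be parsed into a label. */
    static double invalidResponseRate(long invalidResponses, long totalResponses) {
        return ratio(invalidResponses, totalResponses);
    }
}
```

For example, with 8 true positives, 80 true negatives, 2 false positives, and 10 false negatives, accuracy is 88/100 = 0.88, precision is 8/10 = 0.8, and recall is 8/18 ≈ 0.44, which shows how high accuracy can hide weak recall on imbalanced log data.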

## Key Findings and Practical Insights of the Project

Key insights:
1. The quality of prompt engineering may be more important than model selection;
2. Domain knowledge can be injected incrementally from general to structured to specific;
3. In O&M scenarios, precision (reducing false positives) is more important than recall;
4. Requiring the model to output JSON facilitates automation, but parsing robustness needs to be considered.
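On the parsing robustness point, one tolerant approach is to search the model output for the label field rather than requiring the whole response to be valid JSON. The sketch below assumes the label is returned under a key named `label`, which the article does not specify; the class name is hypothetical. Anything that does not yield a single unambiguous 0 or 1 is treated as an invalid response.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the key name "label" is an assumption.
final class LabelExtractorSketch {

    // Matches "label": 0 or "label": 1, optionally quoted, but not longer numbers such as 10 or 1.5.
    private static final Pattern LABEL =
            Pattern.compile("\"label\"\\s*:\\s*\"?([01])\"?(?![0-9.])");

    /**
     * Returns the label if the output contains one, even when it is wrapped in
     * extra prose or a Markdown fence. Returns empty for missing, out-of-range,
     * or contradictory labels so the caller can count an invalid response.
     */
    static OptionalInt extract(String modelOutput) {
        if (modelOutput == null) {
            return OptionalInt.empty();
        }
        Matcher m = LABEL.matcher(modelOutput);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        int first = Integer.parseInt(m.group(1));
        while (m.find()) {
            if (Integer.parseInt(m.group(1)) != first) {
                return OptionalInt.empty(); // conflicting labels in one response
            }
        }
        return OptionalInt.of(first);
    }
}
```

Whether such leniency is appropriate depends on the evaluation goal: a strict parser gives a more honest invalid response rate, while a tolerant one recovers more usable predictions.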

## Application Scenarios and Future Expansion Directions

Current application scenarios: Supercomputer log monitoring, distributed system anomaly detection, security auditing. Expansion directions: Multi-dataset validation, multi-model comparison, online learning for prompt updates, multi-classification expansion, root cause analysis.

## Project Summary and Outlook

LLMLogAnalyzer is a rigorously designed academic project that provides a reproducible experimental framework and evaluation methods. Insights for engineers: Prompt design needs to integrate domain knowledge, evaluation requires multi-dimensional metrics, and local deployment protects data privacy. As LLM capabilities improve, combining them with prompt engineering will play a greater role in O&M automation.
