# Using Large Language Models for Automatic Log Template Extraction: ICL and Prefix Tuning in Practice

> This article introduces an open-source project for automatic log template extraction based on large language models, supporting models like GPT-2, Incoder, T5, and BART. It implements two methods: In-Context Learning (ICL) and Prefix Tuning (PT), and provides a complete implementation of evaluation metrics.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-26T00:11:16.000Z
- Last activity: 2026-05-26T00:19:27.014Z
- Popularity: 154.9
- Keywords: log template extraction, large language models, in-context learning, prefix tuning, AIOps, log analysis, GPT-2, T5, BART, natural language processing
- Page link: https://www.zingnex.cn/en/forum/thread/icl-9264714f
- Canonical: https://www.zingnex.cn/forum/thread/icl-9264714f

---

## Introduction

This open-source project, maintained by KasraRasi and released on GitHub on May 26, 2026, uses Large Language Models (LLMs) to extract log templates automatically. It supports GPT-2, Incoder, T5, and BART, offers two core methods, In-Context Learning (ICL) and Prefix Tuning (PT), and ships a full set of evaluation metrics. The extracted templates serve as a foundation for AIOps tasks such as log analysis and anomaly detection.

## Project Background and Problem Definition

Modern distributed systems and microservice architectures generate massive volumes of unstructured logs that carry key information about system operation. Log template extraction separates each log message into its constant part (the template) and its variable parts (the parameters); it underpins downstream log analysis, anomaly detection, and root cause analysis. Traditional rule-based and statistical extraction methods tend to suffer from low accuracy and generalize poorly to new log formats.
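To make the template/parameter split concrete, here is a minimal sketch. It assumes the common `<*>` placeholder convention for variable parts, which the article does not specify; the log line and the helper names are hypothetical and only illustrate the idea.

```python
import re
from typing import List, Optional

WILDCARD = "<*>"  # assumed placeholder for a variable part of the log


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Turn a template into a regex with one capture group per wildcard."""
    constant_parts = [re.escape(part) for part in template.split(WILDCARD)]
    return re.compile("^" + "(.+?)".join(constant_parts) + "$")


def extract_parameters(template: str, log: str) -> Optional[List[str]]:
    """Return the variable values of `log`, or None if the template does not fit."""
    match = template_to_regex(template).match(log)
    return list(match.groups()) if match else None


# Hypothetical log line and the template an extractor should produce for it.
log = "Connection from 10.0.0.7 closed after 35 ms"
template = "Connection from <*> closed after <*> ms"

print(extract_parameters(template, log))  # ['10.0.0.7', '35']
```

Once a template is known, recovering the parameters is mechanical. The hard part, and the part the project delegates to an LLM, is producing the template in the first place.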

## Core Technical Solutions: ICL and Prefix Tuning

### In-Context Learning (ICL)
ICL requires no fine-tuning: the model picks up the extraction pattern from a few examples placed in the prompt. The project supports ICL with GPT-2, Incoder, T5, and BART.
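A few-shot prompt for this task might be assembled as in the sketch below. The prompt wording, the `Log:` / `Template:` layout, and the function names are assumptions made for illustration; the project's actual prompt format is not described in the article.

```python
from typing import List, Tuple


def build_icl_prompt(examples: List[Tuple[str, str]], query_log: str) -> str:
    """Assemble a few-shot prompt from (log, template) pairs plus one query log."""
    lines = [
        "Extract the template of each log message. "
        "Replace variable parts with <*>.",
        "",
    ]
    for example_log, example_template in examples:
        lines.append(f"Log: {example_log}")
        lines.append(f"Template: {example_template}")
        lines.append("")
    # The prompt ends with an open "Template:" slot for the model to complete.
    lines.append(f"Log: {query_log}")
    lines.append("Template:")
    return "\n".join(lines)


def parse_completion(completion: str) -> str:
    """Keep the first non-empty line; models often keep generating past the answer."""
    for line in completion.splitlines():
        if line.strip():
            return line.strip()
    return ""


# Hypothetical demonstrations and query.
examples = [
    ("Accepted password for alice from 10.0.0.7", "Accepted password for <*> from <*>"),
    ("Deleted block 4417", "Deleted block <*>"),
]
prompt = build_icl_prompt(examples, "Deleted block 9001")
print(prompt)
```

The resulting string would be passed to whichever model is in use. Trimming the completion to its first non-empty line is one simple way to cope with a model that continues the pattern and invents further log lines.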

### Prefix Tuning (PT)
PT is a parameter-efficient fine-tuning method: the pre-trained model's parameters stay frozen and only a small set of prefix vectors is trained, which lowers the compute required. The project supports PT with T5 and BART.
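A rough calculation shows why training only the prefix is cheap. The sketch assumes the basic formulation in which each layer receives one trainable key vector and one trainable value vector per prefix position, with no reparameterization network. The layer count, hidden size, prefix length, and base-model size are illustrative numbers, not figures from the project.

```python
def prefix_param_count(num_virtual_tokens: int, num_layers: int, hidden_size: int) -> int:
    """Trainable parameters when every layer gets a key and a value vector
    for each prefix position (assumed formulation, no reparameterization)."""
    keys_and_values = 2
    return num_virtual_tokens * num_layers * keys_and_values * hidden_size


# Illustrative configuration, not taken from the project.
base_model_params = 220_000_000  # frozen parameters of a hypothetical base model
prefix_params = prefix_param_count(num_virtual_tokens=20, num_layers=12, hidden_size=768)

print(prefix_params)  # 368640
print(f"trainable share: {100 * prefix_params / base_model_params:.3f}%")
```

Under these assumptions the trainable share is a small fraction of one percent of the model, so gradient and optimizer-state memory shrinks accordingly even though the forward pass still runs through the full frozen network.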

## Implementation Details and Evaluation System

#### Environment Dependencies
The project requires Python 3.6+ and depends on Hugging Face Transformers, PyTorch, NLTK, Pandas, and PEFT, among other libraries.

#### Implementation Flow
- ICL: load the model → construct examples → generate templates → save results → evaluate automatically (command-line example: `python icl.py gpt-2`)
- PT: additionally requires dependencies such as PEFT to be installed (command-line example: `python pt.py t5`)
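The generate-and-save part of the ICL flow above can be sketched as a small loop. This is not the code in `icl.py`: the function name, the CSV column names, and the `fake_model` stand-in (which only masks numeric tokens) are hypothetical, chosen so the example runs without downloading a model.

```python
import csv
import io
from typing import Callable, Iterable, TextIO


def run_extraction(
    logs: Iterable[str],
    generate_template: Callable[[str], str],
    out: TextIO,
) -> int:
    """Generate a template per log, write (log, template) rows as CSV, return the row count."""
    writer = csv.writer(out)
    writer.writerow(["log", "predicted_template"])
    count = 0
    for log in logs:
        writer.writerow([log, generate_template(log).strip()])
        count += 1
    return count


def fake_model(log: str) -> str:
    """Stand-in for an LLM call: replaces purely numeric tokens with <*>."""
    return " ".join("<*>" if token.isdigit() else token for token in log.split())


buffer = io.StringIO()  # a real run would write to a file instead
rows_written = run_extraction(["Deleted block 4417", "Retry 3 of 5"], fake_model, buffer)
print(buffer.getvalue())
```

Passing the generator in as a callable keeps the loop independent of the model, so the same code path could serve any of the supported models, with evaluation reading the saved file afterwards.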

#### Evaluation Metrics
- Text similarity: ROUGE-1/2/L, BLEU
- Specialized metrics: PA (Parsing Accuracy), PTA (Precise Template Accuracy), RTA (Relaxed Template Accuracy)
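The article does not define the specialized metrics, so the sketch below rests on assumptions drawn from common usage in the log-parsing literature: PA is the share of log messages whose predicted template exactly matches the reference, while PTA and RTA are read as precision-style and recall-style scores over the sets of distinct templates. Treating a template as correct only on an exact string match is a further simplification. The function names and sample data are hypothetical.

```python
from typing import List, Tuple


def parsing_accuracy(predicted: List[str], truth: List[str]) -> float:
    """Share of messages whose predicted template exactly matches the reference."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        return 0.0
    correct = sum(p.strip() == t.strip() for p, t in zip(predicted, truth))
    return correct / len(truth)


def template_accuracy(predicted: List[str], truth: List[str]) -> Tuple[float, float]:
    """Return (PTA, RTA) over distinct templates under the exact-match assumption."""
    predicted_set = {p.strip() for p in predicted}
    truth_set = {t.strip() for t in truth}
    correct = len(predicted_set & truth_set)
    pta = correct / len(predicted_set) if predicted_set else 0.0  # precision-style
    rta = correct / len(truth_set) if truth_set else 0.0  # recall-style
    return pta, rta


# Hypothetical results: the second message was not generalized correctly.
truth = ["Deleted block <*>", "Deleted block <*>", "Retry <*> of <*>", "Service started"]
predicted = ["Deleted block <*>", "Deleted block 17", "Retry <*> of <*>", "Service started"]

print(parsing_accuracy(predicted, truth))  # 0.75
print(template_accuracy(predicted, truth))  # (0.75, 1.0)
```

In this example every reference template is recovered, so the recall-style score is perfect, but one spurious template (`Deleted block 17`) lowers the precision-style score. That difference is why the template-level metrics are usually reported alongside the message-level one.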

#### Dataset Support
Covers four types of datasets: system logs, distributed system logs, application logs, and server logs.

## Experimental Results and Application Value

#### Experimental Results
1. Larger models perform better, and code-specific models (e.g., Incoder) do especially well on structured logs.
2. Prefix Tuning outperforms ICL, at the cost of an additional training step.
3. PT converges quickly from a small number of samples, and the LLMs generalize well.

#### Application Value
- Operations teams: less manual analysis and faster adaptation to new log formats.
- Developers: open-source code that can be extended, with a choice of several models.
- Researchers: a benchmark implementation and a reference point for comparative experiments.

## Limitations and Future Improvement Directions

#### Current Limitations
1. The approach depends on LLMs, so local deployment requires non-trivial computational resources.
2. Its ability to handle extremely long logs or highly customized formats is limited.
3. Evaluation mainly uses English logs, and multilingual support is insufficient.

#### Future Directions
1. Explore other parameter-efficient fine-tuning methods such as LoRA and adapters.
2. Introduce log semantic understanding to improve template accuracy.
3. Support real-time processing of streaming logs.
4. Develop visualization tools to assist analysis.

## Summary and Insights

This project demonstrates the value of LLMs for a traditional log analysis task. ICL extracts log templates with no training on large labeled datasets, and PT needs only a small amount of training. The project gives engineers runnable code and a documented experimental process, and gives researchers a reproducible benchmark. As LLMs become more capable and more efficient, LLM-based log analysis is expected to become an important technical direction in AIOps.
