LLM-TKESS: A Knowledge-Embedded Soft Sensing Framework for Industrial Process Tasks

LLM-TKESS is a text knowledge-embedded soft sensing framework based on large language models (LLMs). It aligns industrial process variables with the semantic space of LLMs through a two-stage training strategy, providing an innovative intelligent solution for industrial process monitoring.

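Tags: large language models, soft sensing, industrial processes, knowledge embedding, parameter-efficient fine-tuning, LoRA, time-series forecasting, intelligent manufacturing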
Published 2026-05-19 20:15 | Recent activity 2026-05-19 20:23 | Estimated read 6 min
Section 01

Introduction: LLM-TKESS—An Innovative Industrial Soft Sensing Framework Based on Large Language Models

LLM-TKESS is a knowledge-embedded soft sensing framework for industrial process tasks. A two-stage training strategy aligns industrial process variables with the semantic space of large language models (LLMs). The framework combines textual representation of process data, knowledge embedding, and parameter-efficient fine-tuning to address the limitations of traditional soft sensing on complex nonlinear industrial processes.

Section 02

Background and Motivation: Challenges in Industrial Monitoring and the Introduction of LLMs

Industrial process monitoring is a core challenge in manufacturing and the process industries. A soft sensor estimates a hard-to-measure variable from variables that are easy to measure online. Traditional soft sensing relies on statistical methods and physical models, which perform poorly on complex nonlinear processes. Following the breakthroughs of large language models (LLMs) in NLP, researchers have begun applying LLM capabilities to industrial data analysis. LLM-TKESS is one result of this work: it converts industrial time-series data into textual representations so that an LLM can process them and produce more accurate soft sensing predictions.
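
The summary does not say how the time series are actually serialized, so the following is only a minimal sketch of one possible numeric-to-text conversion. The function name `window_to_text`, the variable names, the units, and the 12-second sampling interval are all hypothetical and are not taken from LLM-TKESS.

```python
from typing import Mapping, Sequence


def window_to_text(window: Mapping[str, Sequence[float]],
                   units: Mapping[str, str],
                   step_seconds: float,
                   precision: int = 2) -> str:
    """Render a multivariate window as one line of text per variable.

    Hypothetical format: each line names the variable, its unit, the
    sampling interval, and the readings in time order.
    """
    lines = []
    for name, values in window.items():
        unit = units.get(name, "")
        # Fixed precision keeps the text length predictable for every reading.
        formatted = ", ".join(f"{v:.{precision}f}" for v in values)
        lines.append(
            f"{name} [{unit}] sampled every {step_seconds:g} s over "
            f"{len(values)} steps: {formatted}"
        )
    return "\n".join(lines)


# Toy readings, invented for illustration only.
example = window_to_text(
    {"temperature": [298.1, 298.15, 298.3], "pressure": [1.01, 1.02, 1.02]},
    units={"temperature": "K", "pressure": "bar"},
    step_seconds=12,
)
print(example)
```

Putting the unit and sampling interval into the text is one way to expose physical meaning and temporal context to the model; a real pipeline would still need to pass the string through the LLM's tokenizer.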

Section 03

Core Architecture and Technical Methods: Two-Stage Training and Innovative Strategies

LLM-TKESS adopts a two-stage training strategy:

  1. Base Model Pre-training: The LLM-SS base model is built on the GPT-2 architecture and trained autoregressively with parameter-efficient fine-tuning (PEFT). The reported configuration includes a 6-layer GPT architecture with a 768-dimensional hidden layer, 4 attention heads, LoRA with rank 4 and scaling factor 32, and a sequence length of 96, among other settings. Training only the low-rank LoRA weights reduces computational resource requirements.
  2. Downstream Task Adaptation: Two paradigms are built on lightweight adapter fine-tuning: LLM-DSS (data-driven) and LLM-PDSS (hybrid prompt and data embedding). The base model is frozen, and only a small number of adapter parameters are trained.

The framework's other innovations are: converting numerical time series into textual representations that reflect physical meaning and temporal characteristics; injecting domain knowledge through natural language prompts (LLM-PDSS); and using LoRA and adapters for parameter-efficient fine-tuning, with fewer than 1% of parameters trainable.
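
To see why a rank-4 LoRA setup can stay under 1% trainable parameters, the arithmetic can be sketched as below. The hidden size, layer count, rank, and scaling factor are the values quoted above. The vocabulary size and position count are the stock GPT-2 values, and attaching LoRA only to the fused QKV projection is an assumption; the text does not say which modules LLM-TKESS adapts.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Extra trainable weights for one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def gpt2_block_params(d: int) -> int:
    """Weights and biases of one GPT-2 style block with hidden size d."""
    attention = (d * 3 * d + 3 * d) + (d * d + d)   # fused QKV + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)     # expand to 4d, project back to d
    layer_norms = 2 * 2 * d                         # two LayerNorms, scale + bias each
    return attention + mlp + layer_norms


HIDDEN, LAYERS, RANK, ALPHA = 768, 6, 4, 32   # values quoted in the text
VOCAB, POSITIONS = 50257, 1024                # assumption: stock GPT-2 embeddings

frozen = (LAYERS * gpt2_block_params(HIDDEN)
          + VOCAB * HIDDEN + POSITIONS * HIDDEN     # token + position embeddings
          + 2 * HIDDEN)                             # final LayerNorm

# Assumption: LoRA is attached only to the fused QKV projection of each block.
trainable = LAYERS * lora_params(HIDDEN, 3 * HIDDEN, RANK)

lora_scale = ALPHA / RANK                     # factor applied to the low-rank update
trainable_share = trainable / (frozen + trainable)

print(f"frozen={frozen:,} trainable={trainable:,} share={trainable_share:.4%}")
```

Under these assumptions the LoRA weights are a small fraction of one percent of the model, so adding lightweight downstream adapters can plausibly still leave the total below the 1% figure.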

Section 04

Experimental Validation: Performance on the IndPensim Dataset

LLM-TKESS was validated on the IndPensim industrial simulation dataset, which simulates a penicillin fermentation process and includes variables such as temperature and pressure. Compared with LSTM and Transformer baselines, LLM-TKESS is reported to improve prediction accuracy significantly, especially for variables with complex nonlinear relationships.
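
The summary does not name the accuracy metrics used. RMSE and the coefficient of determination (R²) are common choices for soft sensors, so the sketch below assumes those two. The numbers are toy values for illustration, not results from the IndPensim experiments.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - residual variation / total variation."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values only: a made-up target sequence and one made-up prediction.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]

toy_rmse = rmse(y_true, y_pred)
toy_r2 = r2_score(y_true, y_pred)
print(f"RMSE={toy_rmse:.3f} R2={toy_r2:.3f}")
```

Computing the same metrics for each baseline on an identical test split is what makes a comparison with LSTM or Transformer models meaningful.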

Section 05

Application Prospects and Limitations: Applicable Scenarios and Current Challenges

Application Scenarios: chemical process monitoring (prediction of key quality indicators), energy system optimization (equipment status monitoring), intelligent manufacturing (online quality prediction), and environmental monitoring (soft sensing of environmental indicators).

Limitations: textual conversion introduces additional computational overhead, which may affect low-latency scenarios; LLM-PDSS performance depends on the quality of prompt design; and the model's decision-making process remains hard to interpret.

Section 06

Future Directions and Summary: Framework Value and Improvement Paths

Future Directions: explore more efficient numerical-to-text encoding schemes, develop automatic prompt optimization techniques, and extend the framework to multimodal industrial data such as images and sound.

Summary: LLM-TKESS brings LLM capabilities to industrial process monitoring. By combining textual representation, prompt-based knowledge injection, and parameter-efficient fine-tuning, it models industrial time-series data effectively and could help improve production efficiency, ensure quality, and reduce costs. The open-source implementation provides a reference for related fields.