Zing Forum

Integrating LSTM and Large Language Models: An Intelligent Air Quality Prediction and Health Recommendation System

An end-to-end environmental AI system that combines LSTM time-series prediction with LLM explainable recommendations to achieve accurate PM2.5 prediction and generate health advice verified by hallucination auditing.

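Tags: air quality prediction, PM2.5, LSTM, large language model, LLM, time-series prediction, hallucination detection, environmental AI, explainable AI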
Published 2026-05-16 11:53 | Recent activity 2026-05-16 12:00 | Estimated read 7 min

Section 01

Introduction: Intelligent Air Quality Prediction System Integrating LSTM and LLM

The air-quality-llm project is an end-to-end environmental AI system. It pairs LSTM time-series prediction of PM2.5 with LLM-generated, explainable health advice, and passes that advice through a hallucination audit before presenting it. The aim is to address two limitations of traditional air quality monitoring: the lack of trend prediction and the lack of personalized recommendations.


Section 02

Project Background: Air Pollution Challenges and the Need for AI Integration

Air pollution is a global public health challenge, and PM2.5 poses a serious threat to human health because of its tiny particle size. Traditional monitoring only provides real-time data; it cannot predict future trends or offer personalized recommendations. Advances in deep learning for time-series prediction and in LLMs for natural language understanding and generation have prompted efforts to combine the two into intelligent systems. The air-quality-llm project is one such effort.


Section 03

System Architecture: Dual-Engine Driven Prediction and Recommendation Generation

The system core consists of two major modules:

  1. Time-Series Prediction Layer: Uses an LSTM to capture the spatiotemporal patterns of air quality and introduces ISSA-LSTM to optimize hyperparameters, with AR(24) and persistence prediction as baselines. Input features include PM2.5 and meteorological indicators. Preprocessing includes missing value imputation and normalization. Each input window covers 60 hours of historical data.
  2. Intelligent Recommendation Layer: Generates natural language health advice with Qwen2.5-7B-Instruct. Inputs include the predicted PM2.5, the AQI index and level, and key features. Outputs include pollution causes, pollution source analysis, targeted recommendations, and confidence explanations.
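
The 60-hour input window and the persistence baseline described above can be sketched as follows. This is a minimal illustration, not the project's code: the function names, the one-step-ahead target, and the assumption that PM2.5 sits in column 0 are all hypothetical, and the rows are assumed to be already imputed and normalized.

```python
from typing import List, Sequence, Tuple

WINDOW_HOURS = 60  # history length per sample, as stated in the article


def make_windows(
    rows: Sequence[Sequence[float]],
    target_col: int = 0,            # assumption: PM2.5 is the first column
    window: int = WINDOW_HOURS,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Slice an hourly series into (window, next-hour target) pairs.

    rows[t] holds the feature vector for hour t. Each sample uses hours
    [start, start + window) as input and the target column at hour
    start + window as the value to predict (one step ahead, an assumption).
    """
    if len(rows) <= window:
        raise ValueError("need more rows than the window length")
    inputs, targets = [], []
    for start in range(len(rows) - window):
        end = start + window
        inputs.append([list(r) for r in rows[start:end]])
        targets.append(rows[end][target_col])
    return inputs, targets


def persistence_forecast(
    inputs: Sequence[Sequence[Sequence[float]]], target_col: int = 0
) -> List[float]:
    """Persistence baseline: predict that the next hour equals the last observed hour."""
    return [window[-1][target_col] for window in inputs]


if __name__ == "__main__":
    # Toy series: 65 hours, two features per hour (PM2.5 proxy, a constant).
    series = [[float(t), 0.0] for t in range(65)]
    X, y = make_windows(series)
    print(len(X), len(X[0]), len(X[0][0]))
    print(y[0], persistence_forecast(X)[0])
```

In practice the windows would be converted to tensors of shape (samples, 60, features) before being fed to an LSTM. The persistence baseline needs no training, which makes it a useful sanity check for any learned model.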

Section 04

Key Innovation: Hallucination Auditing Framework Ensures Recommendation Credibility

To address the LLM hallucination problem, the project designed a hallucination auditing framework:

  • Consistency Verification: Recompute AQI from PM2.5 according to EPA standards and check it against the stated AQI, and confirm that the stated AQI level matches its numerical range;
  • Feature Anchoring Check: Ensure recommendations are based on input features;
  • Physical Range Validation: Exclude abnormal values;
  • Confidence Threshold: Flag low-confidence recommendations and either attach references or fall back to conservative strategies.

Experimental results show a 0% hard hallucination rate and 100% consistency between AQI values and categories.
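
The consistency and physical range checks above can be sketched roughly as below. The article does not say which EPA breakpoint table the project uses, so this sketch assumes the 24-hour PM2.5 breakpoints that were in effect before the EPA's 2024 revision. The function names, the tolerance parameter, and the issue strings are hypothetical.

```python
import math
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP
from typing import List, Tuple

# Assumption: pre-2024 EPA 24-hour PM2.5 breakpoints (ug/m3).
# Columns: C_lo, C_hi, I_lo, I_hi, category
PM25_BREAKPOINTS = [
    ("0.0", "12.0", 0, 50, "Good"),
    ("12.1", "35.4", 51, 100, "Moderate"),
    ("35.5", "55.4", 101, 150, "Unhealthy for Sensitive Groups"),
    ("55.5", "150.4", 151, 200, "Unhealthy"),
    ("150.5", "250.4", 201, 300, "Very Unhealthy"),
    ("250.5", "350.4", 301, 400, "Hazardous"),
    ("350.5", "500.4", 401, 500, "Hazardous"),
]
MAX_PM25 = 500.4  # upper end of the table above


def pm25_to_aqi(pm25: float) -> Tuple[int, str]:
    """Linear interpolation between breakpoints, with PM2.5 truncated to 0.1."""
    if not math.isfinite(pm25):
        raise ValueError("PM2.5 must be a finite number")
    c = Decimal(str(pm25)).quantize(Decimal("0.1"), rounding=ROUND_DOWN)
    for c_lo, c_hi, i_lo, i_hi, category in PM25_BREAKPOINTS:
        lo, hi = Decimal(c_lo), Decimal(c_hi)
        if lo <= c <= hi:
            aqi = (i_hi - i_lo) * (c - lo) / (hi - lo) + i_lo
            return int(aqi.quantize(Decimal("1"), rounding=ROUND_HALF_UP)), category
    raise ValueError(f"PM2.5 {pm25} is outside the breakpoint table")


def audit_recommendation(
    pm25: float, claimed_aqi: int, claimed_category: str, tolerance: int = 1
) -> List[str]:
    """Return a list of audit issues; an empty list means the checks passed."""
    if not math.isfinite(pm25) or not 0.0 <= pm25 <= MAX_PM25:
        return [f"physical_range: PM2.5={pm25} is not plausible"]
    issues = []
    expected_aqi, expected_category = pm25_to_aqi(pm25)
    if abs(claimed_aqi - expected_aqi) > tolerance:
        issues.append(f"aqi_mismatch: claimed {claimed_aqi}, expected {expected_aqi}")
    if claimed_category.strip().lower() != expected_category.lower():
        issues.append(
            f"category_mismatch: claimed {claimed_category!r}, expected {expected_category!r}"
        )
    return issues


if __name__ == "__main__":
    print(pm25_to_aqi(35.4))
    print(audit_recommendation(35.4, 100, "Good"))
```

A deterministic check like this is one plausible way to obtain a measurable "hard hallucination rate": any generated report whose stated AQI or category disagrees with the recomputed value is counted as a failure. The feature anchoring and confidence checks are not covered here.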

Section 05

Model Performance Comparison: RMSE Performance Analysis of Various Methods

RMSE comparison on the UCI Air Quality Dataset:

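Model                     RMSE
AR(24) baseline           21.67
Persistence prediction    22.02
Basic LSTM                24.53
ISSA-LSTM                 28.78

The authors report openly that ISSA-LSTM did not beat the baselines on this dataset and attribute the result to dataset characteristics. The gap is not small: both LSTM variants trail both baselines, and ISSA-LSTM's RMSE of 28.78 is about 33% higher than AR(24)'s 21.67. Publishing the negative result rather than omitting it reflects sound scientific practice.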
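
To make the comparison concrete, the sketch below defines RMSE and expresses each reported score as a percentage gap relative to the AR(24) baseline. The RMSE values are copied from the table. The helper names are hypothetical, and nothing here reproduces the project's evaluation code.

```python
import math
from typing import Dict, Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# RMSE values as reported in the article's table.
REPORTED_RMSE = {
    "AR(24) baseline": 21.67,
    "Persistence prediction": 22.02,
    "Basic LSTM": 24.53,
    "ISSA-LSTM": 28.78,
}


def relative_gap_percent(scores: Dict[str, float], reference: str) -> Dict[str, float]:
    """Percentage by which each RMSE exceeds the reference model's RMSE."""
    ref = scores[reference]
    return {name: (value - ref) / ref * 100.0 for name, value in scores.items()}


if __name__ == "__main__":
    gaps = relative_gap_percent(REPORTED_RMSE, "AR(24) baseline")
    for name, gap in gaps.items():
        print(f"{name:<24} {REPORTED_RMSE[name]:>6.2f}  {gap:+.1f}%")
```

Because RMSE is lower-is-better, a positive gap means the model is worse than the reference.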

Section 06

Application Scenarios: Practical Value for Multiple User Groups

The system covers multiple user scenarios:

  • Outdoor sports enthusiasts: Query predictions before planning activities to get advice on whether outdoor activities are suitable;
  • Sensitive groups: Children, the elderly, and patients with respiratory diseases receive targeted protection recommendations;
  • Schools and institutions: Adjust outdoor activity arrangements;
  • Urban planners: Long-term trend analysis supports policy formulation.

Section 07

Future Directions: Technology Expansion and Optimization Plans

Future improvement directions for the project include: introducing Transformer time-series prediction models, exploring RAG to improve LLM recommendation accuracy, developing uncertainty-aware AQI prediction, building a real-time deployment pipeline, and supporting multi-city prediction capabilities.


Section 08

Conclusion: Trustworthy Environmental Decision Support with Integrated AI Technologies

The air-quality-llm project demonstrates that LSTM and LLM components can be integrated into a trustworthy environmental decision-support system, and its hallucination auditing framework offers a reference pattern for LLM applications in critical fields. As the technology matures, such integrated systems are expected to find use in more fields, giving AI systems that can compute, explain their outputs, and be trusted.