LLM-Forecast: A New Time Series Forecasting Method Integrating ARIMA and Large Language Models

LLM-Forecast is a hybrid forecasting method that combines the traditional statistical model ARIMA with large language models (LLMs), aiming to leverage the strengths of both to achieve more accurate time series predictions.

Tags: Time Series Forecasting, ARIMA, Large Language Models, Hybrid Models, Predictive Analytics, Machine Learning, Statistical Models, Data Science
Published 2026-05-06 15:45 · Recent activity 2026-05-06 15:49 · Estimated read: 9 min

Section 01

Introduction: LLM-Forecast—A New Time Series Forecasting Method Integrating ARIMA and Large Language Models

Time series forecasting is a classic problem in data science, widely applied in scenarios such as stock prices, energy demand, and weather forecasting. The traditional statistical method ARIMA has long dominated thanks to its interpretability and rigor, while the rise of large language models (LLMs) brings new possibilities. The open-source project LLM-Forecast proposes a hybrid method integrating the two, aiming to combine ARIMA's statistical rigor with LLMs' pattern recognition capabilities to achieve more accurate time series predictions.


Section 02

Background: Two Paths in Time Series Forecasting—Comparison of ARIMA and LLM Pros and Cons

Traditional Statistical Method: Pros and Cons of ARIMA

ARIMA, proposed by Box and Jenkins in the 1970s, combines autoregression (AR), moving-average (MA) terms, and differencing (the "I", for Integrated) to handle non-stationarity. Advantages: strong mathematical interpretability, low data requirements, a solid theoretical foundation, and robustness to outliers and noise. Limitations: its linear assumptions make it hard to capture non-linear patterns, it relies on manual feature engineering, errors accumulate in long-term forecasts, and it is difficult to integrate unstructured external information.

Large Language Models: Opportunities and Challenges of a New Paradigm

LLMs demonstrate strong pattern recognition and reasoning capabilities through massive text pre-training. Advantages: Capturing complex non-linear relationships, zero/few-shot learning, potential for multi-modal fusion, and capabilities in causal inference and scenario analysis. Challenges: Insufficient numerical precision, hallucination risks, high computational cost, and poor interpretability.


Section 03

Methodology: Fusion Strategy of LLM-Forecast

The core of LLM-Forecast is the complementarity between ARIMA and LLM: ARIMA captures structural components (trends, seasonality, autocorrelation), while LLM identifies complex pattern changes and the impact of external factors.

Architecture Design

Two-stage framework:

  1. ARIMA Baseline Prediction: Stationarity test and differencing → automatic selection of optimal parameters → generation of baseline predictions and residual sequences.
  2. LLM Residual Correction: Treat residuals as the target → construct prompts containing historical residuals, external events, and domain knowledge → LLM predicts correction amounts → add to the baseline.
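The two stages above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the ARIMA stage is reduced to differencing plus a least-squares AR(1) fit, and `llm_residual_correction` is a placeholder for the real prompt-and-query step (here it just averages recent residuals).

```python
import numpy as np

def arima_baseline(y):
    """Stand-in for the ARIMA stage: difference the series (the "I" step),
    fit an AR(1) on the differences by least squares (the "AR" step), and
    return the one-step-ahead forecast plus the in-sample residuals."""
    d = np.diff(y)
    phi = (d[:-1] @ d[1:]) / (d[:-1] @ d[:-1])  # AR(1) coefficient
    residuals = d[1:] - phi * d[:-1]            # what the LLM stage would model
    forecast = y[-1] + phi * d[-1]              # next value = last + predicted step
    return forecast, residuals

def llm_residual_correction(residuals):
    """Placeholder for the LLM stage: the real method would build a prompt
    from the residual history and external events and query an LLM; here we
    return the mean of recent residuals as a trivial correction."""
    return float(np.mean(residuals[-5:]))

y = np.array([10.0, 11.0, 12.5, 13.0, 14.2, 15.1, 16.3, 17.0])
base, res = arima_baseline(y)
corrected = base + llm_residual_correction(res)
```

The key structural point survives even in this toy version: the second stage never sees the raw series, only the residuals the first stage could not explain.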

Prompt Engineering

Key innovations: Encode numerical sequences into text, inject context information (e.g., holidays), describe patterns (trends/cycles/anomalies), and guide intermediate reasoning.
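A hypothetical prompt builder makes these four ideas concrete. The function name and prompt wording below are illustrative, not taken from the project:

```python
def build_forecast_prompt(residuals, events, pattern_note):
    """Illustrative prompt construction: encode numbers as text, inject
    context (e.g. holidays), describe the observed pattern, and ask the
    model to reason step by step before answering."""
    series_txt = ", ".join(f"{r:+.2f}" for r in residuals)
    events_txt = "; ".join(events) if events else "none"
    return (
        f"Historical residuals (oldest to newest): {series_txt}\n"
        f"Upcoming external events: {events_txt}\n"
        f"Observed pattern: {pattern_note}\n"
        "Think step by step about how these events may shift the next "
        "residual, then output a single number as the correction."
    )

prompt = build_forecast_prompt(
    [-0.30, 0.10, 0.80],
    ["national holiday on day 4"],
    "residuals trend upward before holidays",
)
```

Fixed-width signed formatting (`+0.10` rather than `0.1`) is a common trick here, since LLMs tokenize numbers more consistently when they are rendered uniformly.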

Hybrid Weight Optimization

Final prediction = α × ARIMA prediction + (1-α) × LLM corrected prediction, where α is optimized via validation set or dynamically adjusted based on the prediction horizon.
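Tuning α on a validation set can be as simple as a grid search over the blending formula above. The MAE objective and grid granularity below are assumptions for illustration, not necessarily how the project tunes it:

```python
import numpy as np

def optimize_alpha(y_val, arima_pred, llm_pred, grid=np.linspace(0, 1, 101)):
    """Grid-search the blending weight alpha on a validation set,
    minimizing MAE of alpha * ARIMA + (1 - alpha) * LLM."""
    best_alpha, best_mae = 0.0, float("inf")
    for a in grid:
        mae = np.mean(np.abs(y_val - (a * arima_pred + (1 - a) * llm_pred)))
        if mae < best_mae:
            best_alpha, best_mae = float(a), float(mae)
    return best_alpha

y_val = np.array([1.0, 2.0, 3.0])
alpha = optimize_alpha(
    y_val,
    arima_pred=np.array([1.1, 2.1, 3.1]),   # small constant over-forecast
    llm_pred=np.array([0.5, 1.5, 2.5]),     # larger constant under-forecast
)
```

For horizon-dependent weighting, the same search can be run per forecast step, typically shifting weight toward ARIMA at short horizons and toward the LLM correction further out.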


Section 04

Technical Implementation Key Points: Data Preprocessing, Model Selection, and Evaluation Metrics

Data Preprocessing

  • Normalization: Scale numerical values to the range handled by LLMs;
  • Sequence Segmentation: Split long sequences into overlapping windows;
  • Feature Enhancement: Add derived features such as moving averages and volatility.
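The three preprocessing steps can be sketched with NumPy. The min-max range, window length, and choice of derived features below are illustrative assumptions:

```python
import numpy as np

def preprocess(y, window=4):
    """Sketch of the preprocessing pipeline: min-max normalize to [0, 1],
    cut the series into overlapping windows, and derive a moving-average
    and volatility feature per window."""
    y = np.asarray(y, dtype=float)
    y_norm = (y - y.min()) / (y.max() - y.min())          # normalization
    windows = np.lib.stride_tricks.sliding_window_view(y_norm, window)
    mov_avg = windows.mean(axis=1)                        # derived feature
    volatility = windows.std(axis=1)                      # derived feature
    return windows, mov_avg, volatility

w, ma, vol = preprocess([10, 12, 11, 15, 14, 18, 17, 20], window=4)
```

`sliding_window_view` returns every overlapping window as a view without copying, which keeps this cheap even for long series.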

Model Selection

Supports multiple LLM backends: GPT-4/GPT-3.5 (strong reasoning), Claude (long context), open-source models (Llama/Mistral, private deployment).

Evaluation Metrics

  • MAE/RMSE (accuracy);
  • MAPE (percentage error);
  • Directional accuracy (important in financial scenarios);
  • Confidence interval coverage (proportion of true values included in prediction intervals).
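These metrics are standard and short enough to state directly; a compact reference implementation (my own, not the project's):

```python
import numpy as np

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mape(y, yhat):
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

def directional_accuracy(y, yhat):
    """Share of steps where the forecast moves in the same direction as
    the actual series -- often what matters in financial scenarios."""
    return float(np.mean(np.sign(np.diff(y)) == np.sign(np.diff(yhat))))

def interval_coverage(y, lower, upper):
    """Fraction of true values falling inside the prediction interval."""
    return float(np.mean((y >= lower) & (y <= upper)))

y    = np.array([100.0, 102.0, 101.0, 105.0])
yhat = np.array([101.0, 101.5, 102.0, 104.0])
```

Note that MAPE is undefined when any true value is zero, which is one reason MAE/RMSE are usually reported alongside it.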

Section 05

Application Scenarios: Suitable Domains for LLM-Forecast

Financial Forecasting

Stock prices (combining technical indicators + news sentiment), foreign exchange rates (macro events), cryptocurrencies (highly volatile markets).

Energy and Utilities

Electricity demand (integrating weather forecasts + holidays), renewable energy generation (solar/wind output), energy prices (supply-demand + policies).

Supply Chain and Operations

Sales forecasting (promotions + market trends), inventory optimization (demand fluctuations), logistics planning (transportation time + demand hotspots).


Section 06

Experimental Results and Insights: Advantages and Considerations of the Hybrid Method

Advantages:

  • Accuracy improvement: MAE on multiple benchmark datasets is reduced by 15-30% compared to pure ARIMA;
  • Enhanced adaptability: Faster adaptation to structural changes (e.g., the pandemic);
  • Retained interpretability: ARIMA provides statistical foundation, while LLM offers pattern insights.

Considerations:

  • Computational overhead: Dual forecasting increases inference time;
  • Hyperparameter sensitivity: ARIMA parameters and prompts need tuning;
  • Data dependency: LLM correction effect is closely related to the quality of external information.

Section 07

Limitations and Future Directions: Improvement Areas for LLM-Forecast

Improvement areas for the current version:

  • End-to-end optimization: Explore joint training of ARIMA and LLM;
  • Uncertainty quantification: Better estimation of confidence intervals for hybrid predictions;
  • Multivariate expansion: Strengthen support for multivariate sequences;
  • Real-time optimization: Reduce inference latency to support real-time scenarios.

Section 08

Conclusion: Value and Practical Significance of the Hybrid Paradigm

LLM-Forecast represents an important trend in time series forecasting: combining the rigor of traditional statistics with the flexibility of LLMs. This hybrid paradigm is not only applicable to forecasting but also suggests an approach to other data science problems: embracing new technologies without abandoning classic methods, and finding the best combination of the two. For practitioners, LLM-Forecast provides a deployable framework that shows how to integrate LLM capabilities into existing ARIMA pipelines without a complete rebuild. As large models improve and their costs fall, hybrid methods will demonstrate value in more domains.