LLM-Forecast: A New Time Series Forecasting Method Integrating ARIMA and Large Language Models

LLM-Forecast is a hybrid forecasting method that combines the traditional statistical model ARIMA with large language models (LLMs), aiming to leverage the strengths of both to achieve more accurate time series predictions.

Tags: Time Series Forecasting, ARIMA, Large Language Models, Hybrid Models, Predictive Analytics, Machine Learning, Statistical Models, Data Science
Published 2026-05-06 15:45 · Recent activity 2026-05-06 15:49 · Estimated read: 9 min

Section 01

Introduction: LLM-Forecast—A New Time Series Forecasting Method Integrating ARIMA and Large Language Models

Time series forecasting is a classic problem in data science, widely applied in scenarios such as stock prices, energy demand, and weather forecasting. The traditional statistical method ARIMA has long dominated thanks to its interpretability and rigor, while the rise of large language models (LLMs) brings new possibilities. The open-source project LLM-Forecast proposes a hybrid method integrating the two, aiming to combine ARIMA's statistical rigor with LLMs' pattern recognition capabilities to achieve more accurate time series predictions.


Section 02

Background: Two Paths in Time Series Forecasting—Comparison of ARIMA and LLM Pros and Cons

Traditional Statistical Method: Pros and Cons of ARIMA

ARIMA, proposed by Box and Jenkins in the 1970s, combines autoregression (AR), moving-average (MA) terms, and differencing (the "I", for Integrated) to handle non-stationarity. Advantages: strong mathematical interpretability, low data requirements, a solid theoretical foundation, and robustness to outliers and noise. Limitations: its linear assumptions make it hard to capture non-linear patterns, it relies on manual feature engineering, errors accumulate in long-term forecasts, and it is difficult to integrate unstructured external information.

Large Language Models: Opportunities and Challenges of a New Paradigm

LLMs demonstrate strong pattern recognition and reasoning capabilities through massive text pre-training. Advantages: Capturing complex non-linear relationships, zero/few-shot learning, potential for multi-modal fusion, and capabilities in causal inference and scenario analysis. Challenges: Insufficient numerical precision, hallucination risks, high computational cost, and poor interpretability.


Section 03

Methodology: Fusion Strategy of LLM-Forecast

The core of LLM-Forecast is the complementarity between ARIMA and LLM: ARIMA captures structural components (trends, seasonality, autocorrelation), while LLM identifies complex pattern changes and the impact of external factors.

Architecture Design

Two-stage framework:

  1. ARIMA Baseline Prediction: Stationarity test and differencing → automatic selection of optimal parameters → generation of baseline predictions and residual sequences.
  2. LLM Residual Correction: Treat residuals as the target → construct prompts containing historical residuals, external events, and domain knowledge → LLM predicts correction amounts → add to the baseline.
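The two stages above can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the ARIMA stage is reduced to differencing plus a least-squares AR(1) fit, and `llm_residual_correction` is a placeholder for the real prompt-and-query step (here it just averages recent residuals).

```python
import numpy as np

def arima_baseline(y):
    """Stand-in for the ARIMA stage: difference the series (the "I" step),
    fit an AR(1) on the differences by least squares (the "AR" step), and
    return the one-step-ahead forecast plus the in-sample residuals."""
    d = np.diff(y)
    phi = (d[:-1] @ d[1:]) / (d[:-1] @ d[:-1])  # AR(1) coefficient
    residuals = d[1:] - phi * d[:-1]            # what the LLM stage would model
    forecast = y[-1] + phi * d[-1]              # next value = last + predicted step
    return forecast, residuals

def llm_residual_correction(residuals):
    """Placeholder for the LLM stage: the real method would build a prompt
    from the residual history and external events and query an LLM; here we
    return the mean of recent residuals as a trivial correction."""
    return float(np.mean(residuals[-5:]))

y = np.array([10.0, 11.0, 12.5, 13.0, 14.2, 15.1, 16.3, 17.0])
base, res = arima_baseline(y)
corrected = base + llm_residual_correction(res)
```

The key structural point survives even in this toy version: the second stage never sees the raw series, only the residuals the first stage could not explain.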

Prompt Engineering

Key innovations: Encode numerical sequences into text, inject context information (e.g., holidays), describe patterns (trends/cycles/anomalies), and guide intermediate reasoning.
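A hypothetical prompt builder makes these four ideas concrete. The function name and prompt wording below are illustrative, not taken from the project:

```python
def build_forecast_prompt(residuals, events, pattern_note):
    """Illustrative prompt construction: encode numbers as text, inject
    context (e.g. holidays), describe the observed pattern, and ask the
    model to reason step by step before answering."""
    series_txt = ", ".join(f"{r:+.2f}" for r in residuals)
    events_txt = "; ".join(events) if events else "none"
    return (
        f"Historical residuals (oldest to newest): {series_txt}\n"
        f"Upcoming external events: {events_txt}\n"
        f"Observed pattern: {pattern_note}\n"
        "Think step by step about how these events may shift the next "
        "residual, then output a single number as the correction."
    )

prompt = build_forecast_prompt(
    [-0.30, 0.10, 0.80],
    ["national holiday on day 4"],
    "residuals trend upward before holidays",
)
```

Fixed-width signed formatting (`+0.10` rather than `0.1`) is a common trick here, since LLMs tokenize numbers more consistently when they are rendered uniformly.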

Hybrid Weight Optimization

Final prediction = α × ARIMA prediction + (1-α) × LLM corrected prediction, where α is optimized via validation set or dynamically adjusted based on the prediction horizon.
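Tuning α on a validation set can be as simple as a grid search over the blending formula above. The MAE objective and grid granularity below are assumptions for illustration, not necessarily how the project tunes it:

```python
import numpy as np

def optimize_alpha(y_val, arima_pred, llm_pred, grid=np.linspace(0, 1, 101)):
    """Grid-search the blending weight alpha on a validation set,
    minimizing MAE of alpha * ARIMA + (1 - alpha) * LLM."""
    best_alpha, best_mae = 0.0, float("inf")
    for a in grid:
        mae = np.mean(np.abs(y_val - (a * arima_pred + (1 - a) * llm_pred)))
        if mae < best_mae:
            best_alpha, best_mae = float(a), float(mae)
    return best_alpha

y_val = np.array([1.0, 2.0, 3.0])
alpha = optimize_alpha(
    y_val,
    arima_pred=np.array([1.1, 2.1, 3.1]),   # small constant over-forecast
    llm_pred=np.array([0.5, 1.5, 2.5]),     # larger constant under-forecast
)
```

For horizon-dependent weighting, the same search can be run per forecast step, typically shifting weight toward ARIMA at short horizons and toward the LLM correction further out.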


Section 04

Technical Implementation Key Points: Data Preprocessing, Model Selection, and Evaluation Metrics

Data Preprocessing

  • Normalization: Scale numerical values to the range handled by LLMs;
  • Sequence Segmentation: Split long sequences into overlapping windows;
  • Feature Enhancement: Add derived features such as moving averages and volatility.
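The three preprocessing steps can be sketched with NumPy. The min-max range, window length, and choice of derived features below are illustrative assumptions:

```python
import numpy as np

def preprocess(y, window=4):
    """Sketch of the preprocessing pipeline: min-max normalize to [0, 1],
    cut the series into overlapping windows, and derive a moving-average
    and volatility feature per window."""
    y = np.asarray(y, dtype=float)
    y_norm = (y - y.min()) / (y.max() - y.min())          # normalization
    windows = np.lib.stride_tricks.sliding_window_view(y_norm, window)
    mov_avg = windows.mean(axis=1)                        # derived feature
    volatility = windows.std(axis=1)                      # derived feature
    return windows, mov_avg, volatility

w, ma, vol = preprocess([10, 12, 11, 15, 14, 18, 17, 20], window=4)
```

`sliding_window_view` returns every overlapping window as a view without copying, which keeps this cheap even for long series.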

Model Selection

Supports multiple LLM backends: GPT-4/GPT-3.5 (strong reasoning), Claude (long context), open-source models (Llama/Mistral, private deployment).

Evaluation Metrics

  • MAE/RMSE (accuracy);
  • MAPE (percentage error);
  • Directional accuracy (important in financial scenarios);
  • Confidence interval coverage (proportion of true values included in prediction intervals).
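These metrics are standard and short enough to state directly; a compact reference implementation (my own, not the project's):

```python
import numpy as np

def mae(y, yhat):
    return float(np.mean(np.abs(y - yhat)))

def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mape(y, yhat):
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

def directional_accuracy(y, yhat):
    """Share of steps where the forecast moves in the same direction as
    the actual series -- often what matters in financial scenarios."""
    return float(np.mean(np.sign(np.diff(y)) == np.sign(np.diff(yhat))))

def interval_coverage(y, lower, upper):
    """Fraction of true values falling inside the prediction interval."""
    return float(np.mean((y >= lower) & (y <= upper)))

y    = np.array([100.0, 102.0, 101.0, 105.0])
yhat = np.array([101.0, 101.5, 102.0, 104.0])
```

Note that MAPE is undefined when any true value is zero, which is one reason MAE/RMSE are usually reported alongside it.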

Section 05

Application Scenarios: Suitable Domains for LLM-Forecast

Financial Forecasting

Stock prices (combining technical indicators + news sentiment), foreign exchange rates (macro events), cryptocurrencies (highly volatile markets).

Energy and Utilities

Electricity demand (integrating weather forecasts + holidays), renewable energy generation (solar/wind output), energy prices (supply-demand + policies).

Supply Chain and Operations

Sales forecasting (promotions + market trends), inventory optimization (demand fluctuations), logistics planning (transportation time + demand hotspots).


Section 06

Experimental Results and Insights: Advantages and Considerations of the Hybrid Method

Advantages:

  • Accuracy improvement: MAE on multiple benchmark datasets is reduced by 15-30% compared to pure ARIMA;
  • Enhanced adaptability: Faster adaptation to structural changes (e.g., the pandemic);
  • Retained interpretability: ARIMA provides statistical foundation, while LLM offers pattern insights.

Considerations:

  • Computational overhead: Dual forecasting increases inference time;
  • Hyperparameter sensitivity: ARIMA parameters and prompts need tuning;
  • Data dependency: LLM correction effect is closely related to the quality of external information.

Section 07

Limitations and Future Directions: Improvement Areas for LLM-Forecast

Improvement areas for the current version:

  • End-to-end optimization: Explore joint training of ARIMA and LLM;
  • Uncertainty quantification: Better estimation of confidence intervals for hybrid predictions;
  • Multivariate expansion: Strengthen support for multivariate sequences;
  • Real-time optimization: Reduce inference latency to support real-time scenarios.

Section 08

Conclusion: Value and Practical Significance of the Hybrid Paradigm

LLM-Forecast represents an important trend in time series forecasting: combining the rigor of traditional statistics with the flexibility of LLMs. This hybrid paradigm is not only applicable to forecasting but also suggests an approach to other data science problems: embracing new technologies without abandoning classic methods, and finding the best combination of the two. For practitioners, LLM-Forecast provides a deployable framework that shows how to integrate LLM capabilities into existing ARIMA pipelines without a complete rebuild. As large models improve and their costs fall, hybrid methods will demonstrate value in more domains.