Zing Forum

TFG IPC-MCP: An Economic Time Series Forecasting Framework Combining Foundation Models and MCP Protocol

The open-source project tfg-ipc-mcp explores combining time series foundation models such as Chronos-2, TimesFM, and TimeGPT with Model Context Protocol (MCP) signals for inflation forecasting. The study compares the performance of statistical models, deep learning models, and foundation models in predicting Spain's CPI (INE), global CPI (IMF), and Europe's HICP (Eurostat).

Tags: time series forecasting, foundation models, MCP, inflation forecasting, Chronos-2, TimesFM, TimeGPT, ARIMA, deep learning, economic forecasting
Published 2026-06-15 20:46 | Recent activity 2026-06-15 21:23 | Estimated read 10 min

Section 01

TFG IPC-MCP Project Guide: Exploration of Inflation Forecasting by Combining Foundation Models and MCP Protocol

Project Core Information

Project Overview

tfg-ipc-mcp is an open-source end-to-end forecasting framework designed to systematically evaluate the performance of time series foundation models in inflation forecasting and explore the value of semantic signals from the Model Context Protocol (MCP). The project covers three economic time series: Spain's CPI (INE), global CPI (IMF), and Europe's HICP (Eurostat), with a test period from 2021 to 2024, using rolling origin backtesting and MASE as the main evaluation metric.

Core Research Questions

  1. Can time series foundation models outperform traditional statistical models in inflation forecasting tasks?
  2. Can MCP signals add extra value to predictions?
  3. Does this gain depend on specific data scenarios?

Section 02

Research Background and Core Questions

Time series forecasting is a core task in economics and finance, where traditional statistical models (e.g., ARIMA) and deep learning models (e.g., LSTM) are widely used. In recent years, foundation models built specifically for time series (such as Amazon Chronos-2, Google TimesFM, and Nixtla TimeGPT) have emerged.

Meanwhile, the Model Context Protocol (MCP), an emerging technical standard, lets models access real-time data, documents, and tools through standardized interfaces, giving them richer context.

This project focuses on three core questions:

  • Can foundation models outperform traditional statistical models?
  • Can MCP signals improve prediction performance?
  • Does the gain vary across data scenarios?

Section 03

Experimental Design and Methods

Experimental Conditions

Four conditions are designed to evaluate the value of different information sources:

Condition   Description
C0          Univariate forecasting, using only historical series data
C1_inst     Adds institutional signals (Fed interest rates, EPU, Brent crude oil prices, etc.)
C1_mcp      Adds MCP news signals (features extracted from GDELT news headlines by Claude)
C1_full     Uses both institutional signals and MCP signals
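
The four conditions amount to a mapping from a condition name to the signal groups it may use. A minimal sketch of that mapping follows; the column names (`fed_rate`, `epu`, `brent`, `mcp_news_score`) and the helper `exogenous_columns` are hypothetical, since the source names only the signal groups and not the actual columns.

```python
# Hypothetical column names; the source only names the signal groups.
INSTITUTIONAL = ["fed_rate", "epu", "brent"]
MCP_NEWS = ["mcp_news_score"]

# Each experimental condition maps to the exogenous columns it is allowed to see.
CONDITIONS = {
    "C0": [],                              # univariate: history only
    "C1_inst": INSTITUTIONAL,              # institutional signals
    "C1_mcp": MCP_NEWS,                    # MCP news signals
    "C1_full": INSTITUTIONAL + MCP_NEWS,   # both groups
}


def exogenous_columns(condition: str) -> list[str]:
    """Return a copy of the exogenous columns for a condition (empty for C0)."""
    return list(CONDITIONS[condition])
```

Keeping the conditions in one table like this makes it easy to run every model under all four conditions with the same loop.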

All exogenous signals are lagged by one period (shift+1) and standardized with StandardScaler before they are used in the Ridge regression correction.
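
The preprocessing above can be sketched for a single signal in plain Python. This is an illustration under assumptions, not the project's code: it assumes the "correction" is a ridge model fit on the residuals of a base forecast, the helper names are hypothetical, and the data are made up. In practice the same steps would typically use pandas `shift(1)` and scikit-learn's `StandardScaler` and `Ridge`.

```python
from statistics import mean, pstdev


def lag_one(values):
    """Shift a series by one step so position t holds the value observed at t-1."""
    return [None] + list(values[:-1])


def fit_scaler(train):
    """Mean and population std from training rows only (StandardScaler also uses ddof=0)."""
    mu, sigma = mean(train), pstdev(train)
    return mu, (sigma if sigma > 0 else 1.0)


def fit_ridge_1d(x, y, alpha=1.0):
    """Closed-form ridge fit for one feature with an unpenalized intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)
    return slope, y_bar - slope * x_bar


# Made-up data: one exogenous signal and the residuals of some base forecast
# (assumption: the correction is fit on base-model residuals).
signal = [80.0, 82.0, 90.0, 95.0, 91.0, 99.0]
residual = [0.0, 0.1, 0.1, 0.4, 0.5, 0.3]

lagged = lag_one(signal)  # the first row has no lagged value and is dropped
rows = [(x, r) for x, r in zip(lagged, residual) if x is not None]
train, test = rows[:-1], rows[-1:]

# Fit the scaler on training rows only, then reuse it for the test row.
mu, sigma = fit_scaler([x for x, _ in train])
z_train = [(x - mu) / sigma for x, _ in train]
slope, intercept = fit_ridge_1d(z_train, [r for _, r in train], alpha=1.0)

z_test = (test[0][0] - mu) / sigma
correction = intercept + slope * z_test  # added to the base forecast for that period
```

The lag and the train-only scaler fit serve the same purpose: nothing observed at or after the target period leaks into the fitted correction.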

Model Comparison Lineup

  • Statistical Models: ARIMA, SARIMA, SARIMAX, AutoARIMA (dynamic order selection)
  • Deep Learning Models: LSTM, N-BEATS, N-HiTS
  • Foundation Models: Chronos-2 (Amazon), TimesFM (Google), TimeGPT (Nixtla)

Evaluation Methods

Test period: 2021-2024, using rolling origin backtesting. MASE is scaled using historical data from 2002-2020, so a value below 1 means the model beats the naive benchmark measured on that history.
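
A small sketch of how rolling origin backtesting and MASE fit together. The naive lag `m`, the step between origins, the toy series, and the last-value forecaster are all assumptions for illustration; the source does not state which naive benchmark or step size the project uses.

```python
def mase(actual, forecast, history, m=1):
    """MAE of the forecast divided by the in-sample MAE of an m-step naive forecast."""
    naive_errors = [abs(history[i] - history[i - m]) for i in range(m, len(history))]
    scale = sum(naive_errors) / len(naive_errors)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


def rolling_origins(n_obs, first_origin, horizon, step=1):
    """Yield (origin, test_indices): train on [0, origin), forecast the next `horizon` points."""
    origin = first_origin
    while origin + horizon <= n_obs:
        yield origin, list(range(origin, origin + horizon))
        origin += step


# Toy series; the first four points stand in for the fixed 2002-2020 scaling window.
series = [100, 101, 103, 106, 110, 115, 121, 128]
scaling_history = series[:4]

scores = []
for origin, test_idx in rolling_origins(len(series), first_origin=4, horizon=2):
    last_value = series[origin - 1]           # placeholder forecaster: repeat the last value
    forecast = [last_value] * len(test_idx)
    actual = [series[i] for i in test_idx]
    scores.append(mase(actual, forecast, scaling_history))

print(scores)  # [3.25, 4.0, 4.75]
```

Because the scale comes from a fixed history window, scores from different origins and different models are directly comparable.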

Section 04

Core Research Findings and Evidence

Prediction Accuracy Comparison (h=12 forecast horizon)

Series        Best statistical model (MASE)   Best foundation model (MASE)   C1 signal effect
Spain CPI     ARIMA (1.097)                   TimesFM C0 (1.326)             -3% (neutral)
Global CPI    AutoARIMA (1.134)               Chronos-2 C1_inst (0.976)      -14% vs AutoARIMA
Europe HICP   SARIMA (1.656)                  TimesFM C1_full (1.370)        -17%
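
The percentages in the last column can be checked against the MASE values in the table. The sketch below computes the relative change from the best statistical model to the best foundation model; the function name is illustrative.

```python
def relative_change(baseline, candidate):
    """Fractional change in error from baseline to candidate (negative means improvement)."""
    return (candidate - baseline) / baseline


# (best statistical MASE, best foundation MASE) as reported in the table above
rows = {
    "Spain CPI": (1.097, 1.326),
    "Global CPI": (1.134, 0.976),
    "Europe HICP": (1.656, 1.370),
}

for series, (stat_mase, foundation_mase) in rows.items():
    print(f"{series}: {relative_change(stat_mase, foundation_mase):+.1%}")
```

For Global CPI and Europe HICP this gives about -13.9% and -17.3%, matching the reported -14% and -17%. For Spain CPI the same calculation gives about +20.9%, so the reported -3% appears to measure a different comparison (possibly C1 against C0 for the same model); the source does not say which.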

Key Conclusions

  1. Foundation model performance depends on the series: ARIMA leads for Spain CPI; foundation models show clear advantages at medium and long forecast horizons (h = 3-6 and beyond) for the global and European series.
  2. Value of C1 signals varies by series: Global CPI (Chronos-2 + C1_inst) reduces error by 14%; Europe HICP (TimesFM + C1_full) reduces error by 17%; the effect is neutral for Spain CPI.
  3. Model family ranking: Chronos-2 (robust under global institutional signals) > TimesFM (best in Europe C1_full) > TimeGPT (relatively weaker).
  4. Importance of forecast horizon: Statistical models dominate short horizons (h=1); foundation models start competing in medium horizons (h=3-6) and lead in long horizons (h=12) for global/European series.
  5. AutoARIMA is a double-edged sword: Dynamic order selection benefits the global series but degrades performance at long horizons for the Spain and European series.
  6. Standardization is critical: Skipping standardization inflates MAE by 534%; standardization is mandatory when combining heterogeneous signals.

Section 05

Technical Architecture and MCP Signal Flow

Project Structure

  • tfg-forecasting: Data science module, including ETL, EDA, model implementation (statistical/deep learning/foundation models), MCP pipeline, evaluation, etc.
  • tfg-arquitectura: Web platform module for result display and interaction.

Tech Stack

Python, Docker Compose, PostgreSQL+MongoDB, Jupyter Notebook, pytest.

MCP Signal Extraction Flow

  1. News Collection: Obtain news headlines from the GDELT database.
  2. Semantic Extraction: Use Claude to extract economically relevant semantic features from the headlines.
  3. Feature Engineering: Convert the semantic information into structured time series features.
  4. Model Fusion: Feed these features, together with the institutional signals, into the forecasting models as exogenous variables.
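
Step 3 of the flow above can be sketched as a simple aggregation. This example assumes each headline has already been given a numeric score by the LLM; the score range, the monthly mean, and the function name `monthly_signal` are hypothetical, since the source does not describe the feature schema. The GDELT query and the Claude call are omitted.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def monthly_signal(scored_headlines):
    """Average per-headline scores into one value per (year, month)."""
    buckets = defaultdict(list)
    for day, score in scored_headlines:
        buckets[(day.year, day.month)].append(score)
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


# Made-up scores standing in for LLM output on individual headlines.
scored = [
    (date(2022, 1, 5), 0.5),
    (date(2022, 1, 20), 1.0),
    (date(2022, 2, 3), -0.5),
]

features = monthly_signal(scored)
print(features)  # {(2022, 1): 0.75, (2022, 2): -0.5}
```

The result is a monthly series aligned with the CPI data, which can then be lagged and standardized like any other exogenous signal.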

Section 06

Limitations and Future Directions

Current Limitations

  1. Signal History: Spain's MCP signals start only in 2021, which limits how much history the models can learn from.
  2. Computational Resources: Foundation model inference requires significant resources.
  3. Interpretability: The black-box nature of foundation models makes results hard to interpret in terms of economic theory.

Future Directions

  1. Longer Data: Integrate news archives with longer time spans.
  2. Multimodal Signals: Include social media, satellite data, etc.
  3. Real-Time Deployment: Build production-level real-time forecasting systems.
  4. Causal Inference: Move from correlation toward causal mechanisms.

Section 07

Practical Implications

For Data Scientists

Foundation models are not one-size-fits-all; their effectiveness depends on the data and the task. Model selection should weigh series complexity, length of available history, and the exogenous signals on hand.

For Economists

MCP and LLM technology provide new tools for economic forecasting, but how well they complement traditional methods still needs to be evaluated case by case.

For Technical Architects

The project demonstrates the application of MLOps practices (Docker, CI/CD, reproducible research) in the economic forecasting field.