Data Layer: Collects seven public time series datasets from different domains (ETT-small, ElectricityECL, Exchange_Rate, Monash Time Series, NAB, Traffic, Weather). Each dataset comes with a description-generation script and a visualization script. The description-generation script analyzes statistical features such as mean, variance, trend, seasonality, and anomalies, and turns them into structured natural-language descriptions. Data screening runs over multiple rounds via run_analysis.py, filtering series by criteria such as integrity, sequence length, and change amplitude.
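The description-generation step can be illustrated with a minimal sketch. This is not the project's actual script: the function names `series_features` and `render_description`, the z-score anomaly rule, and the trend threshold are all assumptions chosen for illustration, and seasonality detection is left out to keep the example short.

```python
from statistics import fmean, pvariance


def series_features(values, z_threshold=3.0):
    """Compute simple statistical features of a series (hypothetical helper)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean = fmean(values)
    variance = pvariance(values, mu=mean)
    std = variance ** 0.5

    # Least-squares slope against the time index 0..n-1 as a crude trend estimate.
    t_mean = (n - 1) / 2
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - mean) for t, v in enumerate(values)) / denom

    # Assumed rule: call it a trend only if the fitted change over the whole
    # window exceeds half a standard deviation.
    total_change = slope * (n - 1)
    if std == 0 or abs(total_change) <= 0.5 * std:
        trend = "flat"
    else:
        trend = "upward" if slope > 0 else "downward"

    # Assumed rule: a point is anomalous if its z-score exceeds z_threshold.
    anomalies = [
        i for i, v in enumerate(values)
        if std > 0 and abs(v - mean) / std > z_threshold
    ]
    return {
        "length": n,
        "mean": mean,
        "variance": variance,
        "trend": trend,
        "anomaly_indices": anomalies,
    }


def render_description(name, features):
    """Turn the feature dict into a one-line structured description."""
    return (
        f"{name}: {features['length']} points, "
        f"mean {features['mean']:.2f}, variance {features['variance']:.2f}, "
        f"trend {features['trend']}, "
        f"{len(features['anomaly_indices'])} anomalous point(s) "
        f"at indices {features['anomaly_indices']}."
    )


features = series_features([float(i) for i in range(1, 11)])
print(render_description("demo", features))
```

Keeping feature extraction separate from text rendering means the same features could also feed the screening criteria, though whether the project shares them this way is not stated in the source.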
Model Layer: Implements three time series encoders:
- CNN 1D: Captures local patterns with sliding convolution kernels.
- MLP: Maps the series to the embedding space through fully connected layers; simple but competitive.
- PatchTST: Splits the series into patches and uses a Transformer to capture long-range dependencies.
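The patching step that PatchTST-style encoders rely on can be sketched in plain Python. The function name `make_patches` and the example values of `patch_len` and `stride` are assumptions for illustration; the sketch also skips the end-padding that some implementations apply before patching, and it does not include the Transformer itself.

```python
def make_patches(series, patch_len, stride):
    """Split a 1-D series into (possibly overlapping) fixed-length patches."""
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    if len(series) < patch_len:
        raise ValueError("series is shorter than one patch")
    # Patch starts are 0, stride, 2*stride, ... while a full patch still fits.
    last_start = len(series) - patch_len
    return [series[i:i + patch_len] for i in range(0, last_start + 1, stride)]


patches = make_patches(list(range(10)), patch_len=4, stride=2)
for p in patches:
    print(p)
```

With series length L, patch length P, and stride S, this yields floor((L - P) / S) + 1 patches. Each patch then becomes one token for the Transformer, so the sequence it attends over is much shorter than the raw series.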
The encoders are integrated with Qwen series models (Qwen2.5-3B-Instruct, Qwen3-0.6B-Instruct-2512, Qwen3-4B-Instruct-2507) under two training modes: frozen (only the encoder is trained) and full (end-to-end fine-tuning). Multi-GPU parallel training is supported for comparing encoders.
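The difference between the two training modes comes down to which parameters receive gradients. The following PyTorch sketch shows one way to express that. The helper `set_training_mode` is hypothetical, and the two `nn.Linear` modules are small stand-ins for the time series encoder and the Qwen model, not the project's actual classes.

```python
import torch
from torch import nn


def set_training_mode(encoder: nn.Module, llm: nn.Module, mode: str):
    """Mark parameters trainable per mode and return the trainable ones."""
    if mode not in {"frozen", "full"}:
        raise ValueError(f"unknown mode: {mode}")
    # In "frozen" mode the language model gets no gradients; in "full" it does.
    train_llm = mode == "full"
    for p in llm.parameters():
        p.requires_grad_(train_llm)
    # The encoder is trained in both modes.
    for p in encoder.parameters():
        p.requires_grad_(True)
    return [p for m in (encoder, llm) for p in m.parameters() if p.requires_grad]


encoder = nn.Linear(16, 32)  # stand-in for a CNN 1D / MLP / PatchTST encoder
llm = nn.Linear(32, 32)      # stand-in for a Qwen model

params = set_training_mode(encoder, llm, "frozen")
# Only trainable parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(params, lr=1e-4)
print(len(params), "trainable tensors in frozen mode")
```

Passing only the trainable parameters to the optimizer avoids allocating optimizer state for the frozen language model, which is the main memory saving of the frozen mode.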