# LLM-Finance-Framework: A Complete Experimental Framework for Quantitative Trading Backtesting with Large Language Models

> A systematic research framework for evaluating the performance of large language models (LLMs) in financial decision-making, supporting comparisons with traditional quantitative strategies, multi-level memory systems, and experimental analysis of five trading personalities.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-04T13:13:37.000Z
- Last activity: 2026-06-04T13:21:25.386Z
- Popularity: 150.9
- Keywords: LLM, quantitative trading, backtesting framework, financial AI, behavioral finance, LangGraph, machine learning, investment strategy
- Page link: https://www.zingnex.cn/en/forum/thread/llm-finance-framework
- Canonical: https://www.zingnex.cn/forum/thread/llm-finance-framework
- Markdown source: floors_fallback

---

## Introduction

LLM-Finance-Framework is an experimental framework developed by tns-research and released on GitHub on June 4, 2026. Its core goal is to systematically evaluate the performance of large language models in financial trading decision-making. The framework supports comparisons with traditional quantitative strategies, features a multi-level memory system, and allows experimental analysis of five trading personalities, providing a complete empirical platform for AI finance research.

## Project Background and Overview

### Original Author & Source
- Original Author/Maintainer: tns-research
- Source Platform: GitHub
- Original Link: https://github.com/tns-research/llm-finance-framework
- Release Date: June 4, 2026

### Project Overview
LLM-Finance-Framework is an empirical research framework designed to evaluate how large language models perform in financial trading decisions. It provides methodological tools for systematically comparing AI agents with traditional quantitative strategies and for analyzing key dimensions such as memory adaptation, probability calibration, and behavioral patterns. Beyond backtesting, it serves as a research platform for exploring LLMs' learning abilities, decision biases, and human-like behavioral characteristics on real market data.

## Core Trading Mechanisms and Personality Types

### Core Trading Actions
The framework designs a three-action decision space for LLMs:
- BUY: Establish a long position (+1.0), profit when the market rises
- HOLD: Stay in cash with a flat position (0.0); no profit or loss
- SELL: Establish a short position (-1.0), profit when the market falls

Each day, the LLM receives technical analysis indicators and can maintain a strategy log and express emotions. The traded instrument is configurable (e.g., SPY, QQQ, AAPL). Because technical indicators need a warm-up period, the system automatically skips roughly the first 40 trading days.
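The mapping from actions to positions can be sketched as follows. This is a minimal illustration rather than the framework's own code: `ACTION_TO_POSITION`, `WARMUP_DAYS`, and `strategy_returns` are hypothetical names, and the sketch assumes that each decision earns the market return listed for the same index, with no transaction costs.

```python
# Hypothetical sketch: names and the zero-cost assumption are illustrative only.
ACTION_TO_POSITION = {"BUY": 1.0, "HOLD": 0.0, "SELL": -1.0}
WARMUP_DAYS = 40  # approximate indicator warm-up mentioned in the text


def strategy_returns(actions, market_returns, warmup=WARMUP_DAYS):
    """Return per-day strategy returns, skipping the warm-up period.

    actions[i] is the decision applied to market_returns[i].
    """
    if len(actions) != len(market_returns):
        raise ValueError("actions and market_returns must have equal length")
    pairs = list(zip(actions, market_returns))[warmup:]
    # Position (+1, 0, -1) times the market return gives the day's P&L.
    return [ACTION_TO_POSITION[action] * ret for action, ret in pairs]


# Small example with a 2-day warm-up instead of 40.
example = strategy_returns(
    ["HOLD", "HOLD", "BUY", "SELL", "HOLD"],
    [0.01, -0.02, 0.015, -0.01, 0.03],
    warmup=2,
)
print(example)
```

A long position gains when the return is positive, a short position gains when it is negative, and a flat position contributes nothing on that day.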

### Five Trading Personalities
The framework introduces five personalities to study the impact of behavioral frameworks on AI decision-making:
1. **Cautious**: Risk-averse, prioritizes capital preservation, tends to hold cash during high volatility
2. **Aggressive**: Bold and proactive, actively holds positions to pursue excess returns
3. **Balanced**: Systematic, balances risk and return (default benchmark)
4. **Momentum**: Trend-following, goes with the flow
5. **Contrarian**: Counter-cyclical, identifies opportunities from market overreactions

You can switch personalities by modifying the `ACTIVE_PERSONALITY` parameter in `src/config.py`.
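As an illustration of what such a switch might look like, the sketch below validates a personality name against the five listed above. Only `ACTIVE_PERSONALITY` and `src/config.py` come from the project description; the lowercase string values, the `PERSONALITY_TRAITS` dictionary, and `get_personality_prompt` are assumptions made for this example.

```python
# Hypothetical sketch of a personality switch; not the framework's actual config.
PERSONALITY_TRAITS = {
    "cautious": "Risk-averse; prioritize capital preservation.",
    "aggressive": "Actively hold positions to pursue excess returns.",
    "balanced": "Balance risk and return systematically.",
    "momentum": "Follow the prevailing trend.",
    "contrarian": "Look for opportunities in market overreactions.",
}

ACTIVE_PERSONALITY = "balanced"  # assumed value format


def get_personality_prompt(name=None):
    """Return a one-line personality instruction, failing fast on unknown names."""
    key = (name or ACTIVE_PERSONALITY).lower()
    if key not in PERSONALITY_TRAITS:
        valid = ", ".join(sorted(PERSONALITY_TRAITS))
        raise ValueError(f"Unknown personality {key!r}; expected one of: {valid}")
    return f"Trading personality: {key}. {PERSONALITY_TRAITS[key]}"


print(get_personality_prompt())
```

Validating the name up front means a typo in the configuration fails immediately instead of silently falling back to a default personality partway through a long backtest.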

## Experimental Process and Technical Indicator System

### Five-Stage Processing Flow
1. **Data Preparation**: Load historical data, calculate technical indicators such as RSI, MACD, and Stochastic Oscillator, support snapshot or real-time data refresh
2. **Prompt Engineering**: A layered prompt combines the system prompt (rules and indicator definitions) with a four-level memory structure: raw market data (current values plus 20 days of technical history), the strategy log (latest 10 decisions and explanations), memory blocks (weekly/monthly summaries), and a performance summary (comparison with benchmarks)
3. **LLM Decision-Making**: Generate BUY/HOLD/SELL decisions via providers like OpenRouter or local Claude Code, with confidence levels and explanations
4. **Backtesting Engine**: Simulate historical trading performance, track returns, risk indicators, and position management
5. **Analysis and Reporting**: Generate comprehensive reports including statistical validation, behavioral pattern analysis, and performance comparisons
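The prompt-assembly stage can be sketched roughly as below. This is an assumption-laden illustration, not the project's implementation: `build_prompt`, `JOURNAL_WINDOW`, and the section titles are hypothetical, and the strategy log is modeled as a `collections.deque` with `maxlen=10` so that only the latest 10 decisions survive.

```python
from collections import deque

JOURNAL_WINDOW = 10  # latest decisions kept in the prompt, per the description


def build_prompt(system_rules, market_data, journal, memory_blocks, performance):
    """Join the prompt layers in a fixed order, skipping empty ones."""
    sections = [
        ("SYSTEM", system_rules),
        ("MARKET DATA", market_data),
        ("STRATEGY JOURNAL", "\n".join(journal)),
        ("MEMORY", "\n".join(memory_blocks)),
        ("PERFORMANCE", performance),
    ]
    return "\n\n".join(f"## {title}\n{body}" for title, body in sections if body)


# A bounded deque drops the oldest entry once the window is full.
journal = deque(maxlen=JOURNAL_WINDOW)
for day in range(1, 13):
    journal.append(f"Day {day}: HOLD (no clear signal)")

prompt = build_prompt(
    system_rules="Respond with BUY, HOLD, or SELL.",
    market_data="RSI=55.2, MACD histogram=0.12",
    journal=journal,
    memory_blocks=["Week 1: mostly flat"],
    performance="Strategy +1.2% vs. benchmark +0.8%",
)
print(prompt)
```

After twelve appended days, the journal holds days 3 through 12, which keeps the prompt size bounded regardless of how long the backtest runs.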

### Two-Layer Technical Indicator System
- **Daily Historical Sequence**: Daily prompts include 20-day lagged indicator values (RSI, MACD histogram, etc.) for pattern analysis
- **Aggregated Memory Context**: Weekly/monthly/quarterly/annual memories contain aggregated statistics (averages, percentages, etc.) to use tokens efficiently
- **Current Day Analysis**: Current-day RSI, MACD, volatility, and related values for the decision at hand
- **Multi-Timeframe Correlation**: Relate signals across daily trends, weekly averages, and monthly patterns to support more complex decisions
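To make the daily and aggregated layers concrete, here is a minimal pandas sketch on synthetic prices. It is not taken from the framework: the `rsi` helper uses a simple-moving-average variant (often called Cutler's RSI) rather than Wilder's smoothing, the 14-day period is a conventional default, and all variable names are hypothetical.

```python
import numpy as np
import pandas as pd


def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Simple-moving-average RSI (Cutler's variant), not Wilder's smoothing."""
    delta = close.diff()
    gains = delta.clip(lower=0.0)
    losses = -delta.clip(upper=0.0)
    avg_gain = gains.rolling(period).mean()
    avg_loss = losses.rolling(period).mean()
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Synthetic random-walk prices on business days (illustrative data only).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2024-01-01", periods=80)
close = pd.Series(100 + rng.normal(0, 1, 80).cumsum(), index=dates)
indicators = pd.DataFrame({"close": close, "rsi": rsi(close)})

# Daily layer: the 20 values preceding the current day, plus the current value.
daily_history = indicators["rsi"].iloc[-21:-1]
current_rsi = indicators["rsi"].iloc[-1]

# Aggregated layer: compact weekly statistics instead of every daily value.
weekly_summary = indicators["rsi"].resample("W").agg(["mean", "min", "max"])

print(daily_history.round(1).tolist())
print(round(current_rsi, 1))
print(weekly_summary.tail(3).round(1))
```

The early rows are `NaN` until the rolling window fills, which is the kind of warm-up that motivates skipping the first trading days of a backtest.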

## Research Value and Application Scenarios

This framework provides multi-dimensional value for AI finance research:
1. **Memory Adaptation Research**: Explore temporal learning in LLMs' sequential financial decision-making and evaluate the effectiveness of multi-level memory systems (daily/weekly/monthly/quarterly/annual summaries)
2. **Personality Impact Analysis**: Compare performance differences of the five personalities in different market environments and study the impact of behavioral frameworks on AI decisions
3. **Calibration Analysis**: Evaluate the alignment between AI prediction confidence and results, measure overconfidence/underconfidence patterns
4. **Behavioral Bias Detection**: Identify human-like trading biases in LLM decisions (e.g., loss aversion, anchoring effect, etc.)
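Calibration analysis of this kind is commonly done with a Brier score and a binned reliability table. The sketch below shows one way to compute both. It assumes each decision comes with a confidence in [0, 1] and a binary outcome (1 if the decision turned out correct, 0 otherwise); how the framework itself defines outcomes is not stated in the source, and `brier_score` and `calibration_table` are hypothetical names.

```python
import numpy as np


def brier_score(confidences, outcomes):
    """Mean squared gap between stated confidence and the 0/1 outcome."""
    p = np.asarray(confidences, dtype=float)
    y = np.asarray(outcomes, dtype=float)
    return float(np.mean((p - y) ** 2))


def calibration_table(confidences, outcomes, n_bins=5):
    """Return (mean confidence, hit rate, count) for each non-empty bin."""
    p = np.asarray(confidences, dtype=float)
    y = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each confidence to a bin index in [0, n_bins - 1].
    idx = np.digitize(p, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            rows.append((float(p[mask].mean()), float(y[mask].mean()), int(mask.sum())))
    return rows


confidences = [0.9, 0.9, 0.7, 0.7, 0.55, 0.55]
outcomes = [1, 0, 1, 1, 0, 1]

score = brier_score(confidences, outcomes)
table = calibration_table(confidences, outcomes)
for mean_conf, hit_rate, count in table:
    print(f"confidence={mean_conf:.2f} hit_rate={hit_rate:.2f} n={count}")
```

A bin whose mean confidence exceeds its hit rate points to overconfidence; the reverse points to underconfidence. In the toy data, the 0.9-confidence bin is right only half the time.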

## Highlights of Technical Architecture

Key highlights of the framework's code design:
- **PerformanceTracker**: A dedicated class that encapsulates performance-tracking logic
- **JournalManager**: Isolates strategy log management using a rolling window
- **TradeHistoryManager**: Centralizes handling of CSV formatting for trade history
- **DataFrame Optimization**: Replaces individual column assignments with `pd.concat`, eliminating 54 performance warnings and improving memory efficiency by 98%
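The DataFrame optimization follows a general pandas pattern, sketched below with made-up lag columns. Inserting many columns one at a time can fragment a frame, and pandas may then emit a `PerformanceWarning` that recommends `pd.concat(axis=1)`; the warnings mentioned above were presumably of this kind, though the source does not say so. The column names and sizes here are illustrative and not taken from the project.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
prices = pd.DataFrame({"close": 100 + rng.normal(0, 1, 300).cumsum()})

# Fragmenting pattern: each assignment inserts one new column into the frame.
slow = prices.copy()
for lag in range(1, 21):
    slow[f"close_lag_{lag}"] = slow["close"].shift(lag)

# Batched pattern: build all new columns first, then attach them in one concat.
lagged = {f"close_lag_{lag}": prices["close"].shift(lag) for lag in range(1, 21)}
fast = pd.concat([prices, pd.DataFrame(lagged)], axis=1)

print(fast.shape)
```

Both frames hold the same data; the batched version simply avoids repeated column insertion, which matters more as the number of derived columns grows.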

## Conclusion and Significance

LLM-Finance-Framework gives AI finance research a reproducible experimental platform and a standardized evaluation methodology.

For researchers: It opens a new window to explore AI trading behavior, cognitive biases, and adaptive learning.
For practitioners: It provides a rigorous tool to test the feasibility of LLM strategies.

As LLM capabilities improve, such systematic evaluation frameworks will become more important, helping to understand the real capability boundaries of AI in the financial field and avoid being misled by marketing rhetoric.
