Zing Forum

Layered Beyond Moore's Law: Analysis of Price Evolution and Market Structure of Large Model Inference Services

The study finds that token prices fell roughly 600-fold between 2020 and 2026, and that the price half-life of economical models is only 1.10 years, a faster pace than Moore's Law. Software innovation, rather than hardware advancement, is the main driver of cost reduction, and market concentration as measured by the Herfindahl-Hirschman Index (HHI) fell from 4558 to 2086.

Tags: token pricing, Moore's Law, market competition, inference cost, TFP, HHI, AI economics
Published 2026-03-30 23:28 | Recent activity 2026-03-31 11:22 | Estimated read 6 min

Section 01

[Introduction] Layered Beyond Moore's Law: Core Insights into Price and Market Structure of Large Model Inference Services

This study finds that token prices for large model inference services fell roughly 600-fold between 2020 and 2026, and that the price half-life of economical models is only 1.10 years, a faster pace than Moore's Law. Software innovation, rather than hardware advancement, is the main driver of cost reduction. Market concentration as measured by the Herfindahl-Hirschman Index (HHI) fell from 4558 to 2086, and the market shows a layered competitive structure: economical models see rapid price declines, while flagship models keep a brand premium and are priced on capability scarcity.

Section 02

[Background] New Paradigm of LLM Inference Service Pricing and Research Significance

Token-based pricing for Large Language Model (LLM) inference services creates a new kind of commodity, one that is neither a pure information good nor a traditional computing service. Understanding how its price is formed matters both for enterprise cost optimization and for AI democratization. This study is the first systematic economic analysis of token pricing in the LLM inference market.

Section 03

[Research Methods] Data Sources and Analytical Framework

The study combines data from the OpenRouter API (318 models), Epoch AI records (3237 models), and 62 milestone observations used for cross-validation, covering 2020 to 2026. Its analytical methods include the Chow structural break test and Data Envelopment Analysis (DEA).

Section 04

[Evidence 1] Layered Characteristics of Price Decline and Performance Beyond Moore's Law

Over the past six years, token prices have fallen roughly 600-fold, far faster than Moore's Law would predict. The price half-life is 1.10 years for economical models and 1.55 years for mid-tier models, both shorter than the Moore's Law benchmark. Flagship models show no regular price decline (the R² of an exponential fit is only 0.031), and reasoning-specialized models cost on average 31.5 times as much as non-reasoning models (a 'reasoning premium').
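
To make the half-life figures concrete, the following minimal Python sketch converts a half-life into an annual price decline and a cumulative reduction factor, and shows one way a half-life could be estimated from price observations by least squares on log prices. The two-year Moore's Law benchmark, the function names, and the fitting approach are assumptions for illustration; the study's own estimation procedure is not described here.

```python
import math

# Assumption: Moore's Law is taken as one halving of cost every two years.
MOORE_HALVING_YEARS = 2.0


def annual_decline(half_life_years):
    """Fraction by which the price falls per year for a given half-life."""
    return 1.0 - 2.0 ** (-1.0 / half_life_years)


def fold_change(half_life_years, span_years):
    """Cumulative price reduction factor over span_years."""
    return 2.0 ** (span_years / half_life_years)


def fit_half_life(years, prices):
    """Estimate a half-life by least squares on ln(price) versus time."""
    n = len(years)
    logs = [math.log(p) for p in prices]
    t_mean = sum(years) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in years)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(years, logs))
    slope = sxy / sxx  # negative when prices are falling
    return -math.log(2.0) / slope


if __name__ == "__main__":
    tiers = [
        ("economical", 1.10),
        ("mid-tier", 1.55),
        ("Moore benchmark (assumed)", MOORE_HALVING_YEARS),
    ]
    for label, half_life in tiers:
        print(
            f"{label}: {annual_decline(half_life):.1%} per year, "
            f"{fold_change(half_life, 6.0):.1f}-fold over six years"
        )
```

Under these assumptions, a 1.10-year half-life compounds to roughly a 44-fold decline over six years, compared with 8-fold for a two-year halving period.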

Section 05

[Evidence 2] Market Structure Turning Point and Cost Drivers

The Chow test identifies May 2024 as the market's structural break (F = 5.74, p = 0.005), the point at which the market shifted from being technology-driven to being competition-driven. Cost decomposition attributes 103.7% of the cost reduction to the Total Factor Productivity (TFP) residual and -0.9% to GPU hardware: hardware costs rose, and software efficiency gains more than offset the increase. The Malmquist productivity index peaked at 4.11 over Q1-Q4 2024, driven mainly by the shift of the technology frontier.
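
As an illustration of how a Chow test detects a structural break, the sketch below computes the Chow F statistic for a simple linear trend: it compares the residual sum of squares of one pooled regression with that of two separate regressions on either side of a candidate break. The data are synthetic and hypothetical, and the regression specification (a single time trend with an intercept) is an assumption; the study's actual model and data are not reproduced here.

```python
def ols_rss(xs, ys):
    """Residual sum of squares from a simple linear fit y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def chow_f(xs, ys, break_index, k=2):
    """Chow F statistic for a break at break_index.

    k is the number of parameters per regression (intercept and slope).
    """
    rss_pooled = ols_rss(xs, ys)
    rss_split = (
        ols_rss(xs[:break_index], ys[:break_index])
        + ols_rss(xs[break_index:], ys[break_index:])
    )
    n = len(xs)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Hypothetical log-price series over 20 months with a small deterministic wiggle.
months = list(range(20))
wiggle = [0.01 * (-1) ** i for i in months]

# Series whose slope steepens at month 10 (a structural break).
kinked = [
    (-0.1 * m if m < 10 else -1.0 - 0.5 * (m - 10)) + e
    for m, e in zip(months, wiggle)
]
# Series with one constant slope throughout (no break).
steady = [-0.3 * m + e for m, e in zip(months, wiggle)]

if __name__ == "__main__":
    print("F with break:   ", chow_f(months, kinked, 10))
    print("F without break:", chow_f(months, steady, 10))
```

The statistic is compared against an F distribution with k and n - 2k degrees of freedom to obtain a p-value; that step is omitted here because it needs a statistics library.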

Section 06

[Evidence 3] Market Competitive Structure and Sino-US Training Cost Differences

Market concentration, measured by the HHI, fell from 4558 to 2086 (a drop of about 54%), but barriers to entry in the flagship segment remain high. The elasticity between training cost and inference price is 0.432, so training costs do not pass through to prices one-for-one. Training costs differ by a factor of 63 between China and the US, a gap the study attributes to architectural innovation rather than to differences in factor prices.
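
For readers unfamiliar with the HHI, the sketch below shows the standard calculation: the sum of squared market shares expressed in percent, so that a monopoly scores 10000. The share values are hypothetical, and the "equivalent number of equal-sized providers" helper is an added interpretive aid, not a figure from the study.

```python
def hhi(shares_percent):
    """Herfindahl-Hirschman Index from market shares given in percent."""
    total = sum(shares_percent)
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"shares must sum to 100, got {total}")
    return sum(s ** 2 for s in shares_percent)


def equivalent_providers(hhi_value):
    """Number of equal-sized providers that would produce this HHI."""
    return 10000.0 / hhi_value


if __name__ == "__main__":
    # Hypothetical shares, for illustration only.
    print(hhi([60, 25, 10, 5]))
    # The two HHI values reported in the study.
    for value in (4558, 2086):
        print(value, round(equivalent_providers(value), 2))
```

Read this way, an HHI of 4558 corresponds to about 2.2 equal-sized providers and 2086 to about 4.8.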

Section 07

[Conclusion] Core Findings of Layered Beyond Moore's Law

The "layered beyond Moore's Law" hypothesis holds: economical models see rapid price declines, while flagship models keep their premiums. Software innovation is the core engine of cost reduction. Market stratification is expected to become the norm, with the pricing of leading models reflecting brand and capability scarcity rather than cost alone.

Section 08

[Policy and Outlook] Competitive Policy Challenges and Paths to AI Democratization

The study recommends regulating token economics as an independent subfield of digital commodity pricing. The 600-fold price drop lays the foundation for AI democratization, but the distribution of its benefits needs attention if a digital divide is to be avoided. International technology governance should take differences in architectural innovation capability into account and re-evaluate the effectiveness of hardware export controls.