Section 01
[Introduction] A New Paradigm for Evaluating Large Model Inference: Shifting from a Compute Race to Energy Efficiency
Researchers propose treating LLM inference as an "energy-to-token production" process and introduce the Token Production Function framework to describe it. They call on the industry to report energy metrics, such as joules per token and PUE-adjusted power (power scaled by the data center's power usage effectiveness), alongside accuracy when evaluating inference systems, in order to promote the sustainable development of AI.
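
To make the two metrics concrete, here is a minimal sketch of how joules per token might be computed with and without a PUE adjustment. It is not the Token Production Function itself, which this summary does not define. The function name, its parameters, and the sample numbers are all hypothetical. It relies only on the standard definition of PUE as total facility energy divided by IT equipment energy.

```python
def joules_per_token(avg_power_watts: float,
                     duration_seconds: float,
                     tokens_generated: int,
                     pue: float = 1.0) -> float:
    """Return energy per generated token in joules (hypothetical helper).

    avg_power_watts:  average power drawn by the inference hardware
    duration_seconds: wall-clock time of the measured run
    tokens_generated: number of tokens produced during the run
    pue:              power usage effectiveness of the facility
                      (total facility energy / IT energy, so >= 1.0)
    """
    if tokens_generated <= 0:
        raise ValueError("tokens_generated must be positive")
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")

    # Energy used by the IT equipment alone: watts * seconds = joules.
    it_energy_joules = avg_power_watts * duration_seconds

    # Scale by PUE to include facility overhead such as cooling.
    facility_energy_joules = it_energy_joules * pue

    return facility_energy_joules / tokens_generated


# Illustrative numbers only (not taken from the source):
# a 400 W accelerator running for 60 s and producing 12,000 tokens.
raw = joules_per_token(400.0, 60.0, 12_000)
adjusted = joules_per_token(400.0, 60.0, 12_000, pue=1.25)
print(f"{raw:.2f} J/token at the device")        # 2.00 J/token at the device
print(f"{adjusted:.2f} J/token PUE-adjusted")    # 2.50 J/token PUE-adjusted
```

Reporting both figures separates the efficiency of the model and hardware from the overhead of the facility that hosts them.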