Section 01
[Introduction] GAR: A Carbon-Aware Routing Optimization Framework for LLM Inference
A Google Research team proposes GAR (Green-Aware Routing), a framework that incorporates carbon emissions into LLM inference routing decisions. It minimizes CO₂ emissions per request subject to two constraints: a minimum accuracy threshold and a p95 latency Service Level Objective (SLO). The work provides both a theoretical foundation and practical solutions for green AI inference.
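To make the objective concrete, the routing decision described above can be read as a constrained selection: among the candidate routes that satisfy the accuracy floor and the p95 latency SLO, choose the one with the lowest estimated CO₂ per request. The following is a minimal sketch under that reading only, not the GAR algorithm itself. The names `Candidate` and `pick_route`, the field names, and all numbers are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-request estimates for one routing option."""
    name: str
    co2_grams: float       # estimated CO2 emitted by serving the request
    accuracy: float        # estimated accuracy, in [0, 1]
    p95_latency_ms: float  # estimated p95 latency, in milliseconds


def pick_route(
    candidates: Sequence[Candidate],
    min_accuracy: float,
    p95_slo_ms: float,
) -> Optional[Candidate]:
    """Return the lowest-CO2 candidate that meets both constraints.

    Returns None when no candidate is feasible; how to handle that case
    (fallback, rejection, relaxing a constraint) is left open here.
    """
    feasible = [
        c for c in candidates
        if c.accuracy >= min_accuracy and c.p95_latency_ms <= p95_slo_ms
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: c.co2_grams)


# Illustrative values only; these are not measurements from the source.
candidates = [
    Candidate("small", co2_grams=0.2, accuracy=0.70, p95_latency_ms=300.0),
    Candidate("medium", co2_grams=0.5, accuracy=0.85, p95_latency_ms=600.0),
    Candidate("large", co2_grams=1.5, accuracy=0.92, p95_latency_ms=1500.0),
]

choice = pick_route(candidates, min_accuracy=0.80, p95_slo_ms=1000.0)
print(choice.name if choice else "no feasible route")
```

With these made-up numbers, "small" fails the accuracy floor and "large" exceeds the latency SLO, so the sketch selects "medium" even though it is not the lowest-emission option overall. This illustrates that the constraints are applied first and carbon is minimized only within the feasible set.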