Section 01
CR² Framework Overview: A Cost-Risk Balancing Solution for Mobile Edge LLM Inference
CR² is a cost-aware, risk-controllable routing framework for LLM inference in mobile edge scenarios. It adopts a two-stage device-edge architecture (device-side edge gating followed by an edge-side utility selector) and integrates a calibration mechanism based on conformal risk control. Together these allow flexible trade-offs among latency, energy consumption, and accuracy, and reduce deployment cost by 16.9% compared to baseline methods.
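
To make the two-stage idea concrete, the following is a minimal illustrative sketch, not the CR² implementation. It assumes the device-side gate keeps a query on the device when a confidence score meets a threshold, that the threshold is calibrated with the standard conformal risk control bound on held-out data, and that the edge-side selector maximizes a linear utility (estimated accuracy minus weighted latency and energy). All names (`calibrate_gate_threshold`, `EdgeModel`, `select_edge_model`, `route`), the form of the utility, and the weights are hypothetical.

```python
from dataclasses import dataclass


def calibrate_gate_threshold(confidences, local_correct, alpha):
    """Pick the smallest confidence threshold whose conformal risk bound is <= alpha.

    Assumed loss for one calibration query at threshold t: 1 if the query would be
    answered on the device (confidence >= t) and the device answer is wrong, else 0.
    This loss is non-increasing in t, as conformal risk control requires.
    """
    n = len(confidences)
    # Candidate thresholds in ascending order; infinity means "always offload".
    candidates = sorted(set(confidences)) + [float("inf")]
    for t in candidates:
        errors = sum(
            1 for c, ok in zip(confidences, local_correct) if c >= t and not ok
        )
        # Conformal risk control bound with loss upper bound B = 1:
        # (n / (n + 1)) * empirical_risk + 1 / (n + 1) == (errors + 1) / (n + 1)
        if (errors + 1) / (n + 1) <= alpha:
            return t
    # Too few calibration samples for this alpha: never answer on the device.
    return float("inf")


@dataclass(frozen=True)
class EdgeModel:
    name: str
    est_accuracy: float  # estimated probability of a correct answer
    latency_s: float     # expected latency in seconds
    energy_j: float      # expected energy in joules


def select_edge_model(models, w_latency, w_energy):
    """Return the edge model with the highest (assumed linear) utility."""
    return max(
        models,
        key=lambda m: m.est_accuracy - w_latency * m.latency_s - w_energy * m.energy_j,
    )


def route(confidence, threshold, models, w_latency=0.1, w_energy=0.01):
    """Stage 1: device-side gate. Stage 2: edge-side utility selection."""
    if confidence >= threshold:
        return "device"
    return select_edge_model(models, w_latency, w_energy).name


# Toy calibration data (made up for illustration only).
cal_conf = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
cal_ok = [True, True, True, True, False, True, False, False, False]
threshold = calibrate_gate_threshold(cal_conf, cal_ok, alpha=0.25)

edge_models = [
    EdgeModel("small", est_accuracy=0.70, latency_s=0.2, energy_j=1.0),
    EdgeModel("large", est_accuracy=0.90, latency_s=1.0, energy_j=5.0),
]
print(threshold, route(0.9, threshold, edge_models), route(0.2, threshold, edge_models))
```

In this sketch, raising `w_latency` or `w_energy` shifts the selector toward cheaper edge models, while lowering `alpha` raises the gate threshold and sends more queries to the edge. This is one plausible way the trade-off among latency, energy, and accuracy could be exposed, under the assumptions stated above.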