Section 01
Introduction: SLICE—An SLO-Driven LLM Inference Scheduling Framework for Edge Computing
SLICE is an LLM inference scheduling framework designed for edge computing. It addresses the differing Service Level Objective (SLO) requirements of latency-sensitive tasks (e.g., real-time dialogue) and throughput-oriented tasks (e.g., batch document processing) in resource-constrained edge environments. The framework makes SLOs the primary basis for scheduling decisions, and it aims to improve both resource utilization and service quality through strategies such as dynamic resource allocation and adaptation to edge-specific conditions.
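To make the idea of SLO-driven ordering concrete, the following is a minimal sketch and not SLICE's actual algorithm, which this introduction does not specify. It assumes each task carries a single completion-time target and orders tasks by remaining slack (least slack first). The names `SloClass`, `Task`, `slo_target_s`, and `order_by_slack` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class SloClass(Enum):
    """Hypothetical task classes mirroring the two kinds described above."""
    LATENCY = "latency"        # e.g., real-time dialogue
    THROUGHPUT = "throughput"  # e.g., batch document processing


@dataclass(frozen=True)
class Task:
    name: str
    slo_class: SloClass
    arrival_s: float     # arrival time in seconds
    slo_target_s: float  # assumed SLO: finish within this many seconds of arrival

    def slack(self, now_s: float) -> float:
        """Seconds left before the SLO target is missed (negative if already missed)."""
        return (self.arrival_s + self.slo_target_s) - now_s


def order_by_slack(tasks, now_s):
    """Return tasks sorted so the one closest to violating its SLO comes first."""
    return sorted(tasks, key=lambda t: t.slack(now_s))


# Illustrative workload: one dialogue request among two batch jobs.
tasks = [
    Task("batch-summarize", SloClass.THROUGHPUT, arrival_s=0.0, slo_target_s=60.0),
    Task("chat-reply", SloClass.LATENCY, arrival_s=2.0, slo_target_s=1.0),
    Task("batch-translate", SloClass.THROUGHPUT, arrival_s=1.0, slo_target_s=30.0),
]
schedule = [t.name for t in order_by_slack(tasks, now_s=2.5)]
print(schedule)  # ['chat-reply', 'batch-translate', 'batch-summarize']
```

In this sketch the dialogue request jumps ahead of the batch jobs even though it arrived last, because its tight target leaves it the least slack. A real scheduler would also need to account for resource availability and batching, which are outside the scope of this illustration.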