Section 01
ZoneTier-LLM: A New Hierarchical Flash Storage Management Scheme for Edge LLM Inference (Introduction)
ZoneTier-LLM is a two-tier zoned flash storage management prototype, built on ConZone+ and designed specifically for edge LLM inference. It targets two problems: the limited resources of edge devices, and the distinct I/O patterns of LLM inference, in which model weights are read sequentially and never modified while the KV cache is read and written at random. To address them, ZoneTier-LLM combines three strategies: media-aware data placement, heat-driven migration, and hybrid I/O scheduling. Together these optimize storage use, improve inference performance, reduce hardware cost, and extend device lifespan.
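
To make the first two strategies concrete, the following is a minimal illustrative sketch, not the ZoneTier-LLM implementation. It assumes two hypothetical tiers (`FAST` and `DENSE`), a per-block access counter as the heat metric, and arbitrary `hot`/`cold` thresholds; none of these names or values come from the prototype. Placement picks an initial tier from the I/O pattern described above, and a periodic migration pass moves blocks according to observed heat.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FAST = "fast"    # hypothetical low-latency tier (assumption)
    DENSE = "dense"  # hypothetical high-capacity tier (assumption)


class Kind(Enum):
    WEIGHTS = "weights"    # sequential, read-only
    KV_CACHE = "kv_cache"  # random read/write


@dataclass
class Block:
    name: str
    kind: Kind
    tier: Tier
    heat: int = 0  # accesses counted since the last migration pass


def place(name: str, kind: Kind) -> Block:
    """Media-aware placement: choose the initial tier from the I/O pattern."""
    tier = Tier.DENSE if kind is Kind.WEIGHTS else Tier.FAST
    return Block(name, kind, tier)


def record_access(block: Block) -> None:
    """Count one access toward the block's heat."""
    block.heat += 1


def migrate(blocks: list[Block], hot: int = 8, cold: int = 1) -> None:
    """Heat-driven migration: promote hot blocks, demote cold ones.

    The thresholds are illustrative assumptions. Heat is reset after each
    pass so that the next pass reflects only recent accesses.
    """
    for b in blocks:
        if b.heat >= hot:
            b.tier = Tier.FAST
        elif b.heat <= cold:
            b.tier = Tier.DENSE
        b.heat = 0
```

In this sketch, a weight block that is accessed often between two passes is promoted to the fast tier, while a KV cache block that goes untouched is demoted to the dense tier. A real policy would also need to account for zone capacity, write amplification, and migration cost, which are omitted here.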