Section 01
[Introduction] KVCache-DSL: An MLIR-based Domain-Specific Language for KV Cache Optimization in Large Language Models
KVCache-DSL is an MLIR-based domain-specific language (DSL) project that targets performance problems in key-value (KV) cache memory management during large language model (LLM) inference. Rather than tuning each aspect in isolation, it analyzes and transforms the KV cache's memory layout, access patterns, and vectorization jointly.
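To make the memory-layout concern concrete, the following minimal sketch estimates a KV cache's footprint and compares the element strides of two possible layouts. It is an illustration only and not part of KVCache-DSL: the function names, the model dimensions, and the two layouts (`head_major`, `token_major`) are hypothetical, chosen for the example.

```python
def kv_cache_bytes(layers, heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Bytes needed to hold keys and values for every layer (hypothetical helper)."""
    # The factor of 2 accounts for the separate K and V tensors.
    return 2 * layers * batch * heads * seq_len * head_dim * bytes_per_elem


def row_major_strides(shape):
    """Element strides of a dense row-major tensor with the given shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


# Assumed example dimensions; not taken from any specific model.
heads, seq_len, head_dim = 8, 1024, 64

# Layout A: [head, token, dim] keeps one head's tokens contiguous.
head_major = row_major_strides((heads, seq_len, head_dim))
# Layout B: [token, head, dim] keeps one token's heads contiguous.
token_major = row_major_strides((seq_len, heads, head_dim))

total = kv_cache_bytes(layers=32, heads=heads, head_dim=head_dim,
                       seq_len=seq_len, batch=1)
print(f"cache size: {total / 2**20:.0f} MiB")
print("token stride, head-major :", head_major[1])
print("token stride, token-major:", token_major[0])
```

For a fixed head, stepping to the next token moves 64 elements in the head-major layout but 512 elements in the token-major layout, so the same attention read touches memory very differently depending on the layout. This is the kind of interaction between layout and access pattern that a joint analysis has to account for.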