Section 01
[Introduction] IREE Optimization Experiments: Dynamic Shape Inference Optimization for LLMs like DeepSeek, Qwen, and Gemma
The PLC Lab at Chongqing University has open-sourced the iree-optimization project, which uses the IREE compiler to run dynamic shape optimization experiments on mainstream large language models (LLMs) such as DeepSeek, Qwen, and Gemma. Its goal is to explore technical paths for running LLMs efficiently on edge devices. Built on the IREE framework, the project addresses the challenges that dynamic shapes in LLM inference pose for static compilation, and it serves as a reference for compiler optimization in LLM deployment.