Section 01
Edge-LLM: Core Overview of the Edge LLM Inference Framework
Edge-LLM is a specialized inference framework designed for mobile and embedded devices. It addresses key challenges in edge deployment, such as constrained memory, compute, and power budgets, and supports hardware acceleration on Qualcomm QNN/HTP, MediaTek Neuron/APU, and CUDA GPU platforms. Key features include INT8/INT4 quantization, a unified ELM model format for cross-platform deployment, and a complete toolchain covering model conversion, compilation, and runtime execution.
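The Edge-LLM API itself is not shown here, but the INT8 quantization it lists can be illustrated with a minimal sketch of symmetric per-tensor quantization, a common scheme for edge inference. The function names and the round-trip workflow below are illustrative assumptions, not Edge-LLM's actual interface:

```python
def quantize_int8(xs):
    # Symmetric per-tensor INT8 quantization: map floats onto [-127, 127]
    # using a single scale factor (zero-point fixed at 0).
    scale = max(abs(x) for x in xs) / 127.0
    qs = [max(-127, min(127, round(x / scale))) for x in xs]
    return qs, scale

def dequantize(qs, scale):
    # Recover approximate float values; error is bounded by scale / 2.
    return [q * scale for q in qs]

# Round-trip a small weight vector through INT8 and back.
weights = [0.5, -1.2, 0.03, 2.4]
qs, scale = quantize_int8(weights)
recovered = dequantize(qs, scale)
```

INT4 follows the same idea with the range narrowed to [-7, 7], trading accuracy for a further 2x reduction in weight storage.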