Section 01
[Introduction] Analysis of pccx NPU's Bare-Metal LLM Execution on KV260
The pccx-FPGA-NPU-LLM-kv260 project is an open-source effort to build a dedicated Neural Processing Unit (NPU) on the AMD Kria KV260 development board and run it bare-metal, that is, without an operating system, for efficient Large Language Model (LLM) inference. It covers W4A8 quantization (4-bit weights, 8-bit activations), GEMM/GEMV datapath design, and KV cache scheduling, and serves as a reference design for edge AI deployment.
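To make the W4A8 idea concrete, here is a minimal software sketch of a quantized matrix-vector product (GEMV). It is not taken from the project: the function names `quantize_symmetric` and `w4a8_gemv`, the symmetric per-row weight scaling, and the per-vector activation scaling are all assumptions chosen for illustration, and the actual NPU datapath may use a different scheme.

```python
def quantize_symmetric(values, bits):
    """Map floats to signed integers of the given width (hypothetical helper).

    Uses one scale per call so that the largest magnitude maps to qmax.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def w4a8_gemv(weight_rows, activations):
    """Approximate y = W @ x with 4-bit weights and 8-bit activations.

    Assumption: weights are quantized per row, activations per vector.
    """
    q_act, act_scale = quantize_symmetric(activations, bits=8)
    out = []
    for row in weight_rows:
        q_w, w_scale = quantize_symmetric(row, bits=4)
        # Multiply-accumulate happens entirely on integers.
        acc = sum(w * a for w, a in zip(q_w, q_act))
        # A single rescale converts the integer sum back to a float.
        out.append(acc * w_scale * act_scale)
    return out


if __name__ == "__main__":
    weights = [[1.0, -0.5], [0.25, 0.75]]
    x = [1.0, 2.0]
    print(w4a8_gemv(weights, x))  # close to the float result [0.0, 1.75]
```

The point of the sketch is that the inner loop touches only small integers, which is what makes the format attractive for FPGA multiply-accumulate units; the floating-point scales are applied once per output element.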