Section 01
Introduction: In-Depth Analysis of vLLM-XPU, a Performance Profiling and Visualization Tool for Intel XPU Inference
vllm-xpu-breakdown is a performance profiling tool for vLLM inference, designed specifically for Intel XPU. It tracks and visualizes how operators are dispatched across the different backends (vllm-xpu-kernels, torch-xpu-ops, triton, cpu), which helps developers see where inference time goes and optimize large-model inference performance.
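To make the idea of a per-backend breakdown concrete, here is a minimal sketch in plain Python. It is not the tool's actual implementation or API: the record layout, the operator names, the timings, and the helper names `breakdown_by_backend` and `format_report` are all hypothetical and chosen only for illustration. Only the four backend labels come from the description above.

```python
from collections import defaultdict

# Hypothetical operator records: (operator name, backend, duration in microseconds).
# Names and timings are made up for illustration; this is not real profiler output.
SAMPLE_OPS = [
    ("paged_attention", "vllm-xpu-kernels", 420.0),
    ("rms_norm", "vllm-xpu-kernels", 35.0),
    ("aten::add", "torch-xpu-ops", 12.5),
    ("fused_moe_kernel", "triton", 310.0),
    ("aten::item", "cpu", 8.0),
    ("aten::mul", "torch-xpu-ops", 14.5),
]


def breakdown_by_backend(ops):
    """Return {backend: (total_us, share_of_total)} for the given records."""
    totals = defaultdict(float)
    for _name, backend, duration_us in ops:
        totals[backend] += duration_us
    grand_total = sum(totals.values())
    return {
        backend: (total, total / grand_total if grand_total else 0.0)
        for backend, total in totals.items()
    }


def format_report(ops):
    """Return one text line per backend, sorted by total time, largest first."""
    rows = sorted(
        breakdown_by_backend(ops).items(),
        key=lambda item: item[1][0],
        reverse=True,
    )
    return [
        f"{backend:<18}{total:>10.1f} us{share:>8.1%}"
        for backend, (total, share) in rows
    ]


if __name__ == "__main__":
    for line in format_report(SAMPLE_OPS):
        print(line)
```

Sorting by total time puts the backend that dominates the run at the top, which is usually the first thing to look at when deciding where optimization effort should go.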