Section 01
Introduction: On-Device LLM Inference Test Finds Thermal Management Is the Main Bottleneck on Mobile, While NPU Energy Efficiency Shines
Tests of Qwen 2.5 1.5B on a Raspberry Pi NPU, a Samsung S24 Ultra, an iPhone 16 Pro, and an RTX 4050 show three results: the iPhone's throughput halved after two iterations, the S24 Ultra hit system-enforced frequency throttling, and the Hailo-10H NPU delivered energy efficiency comparable to the RTX 4050 while drawing less than 2 W.