Section 01
LLM-Engine: CPU-Only Local LLM Inference Desktop App (Core Overview)
LLM-Engine is a local LLM inference engine written from scratch in C++. It loads GGUF-format models and runs entirely on the CPU, with no GPU or cloud API required. It implements a complete Transformer inference stack (tokenizer, attention, KV cache, sampling) and includes a desktop chat interface built with Dear ImGui. Its main advantages are privacy (all inference happens locally), no dependence on an internet connection, and educational value as a hands-on resource for understanding how LLM inference works.
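To make the last stage of that stack (sampling) more concrete, here is a minimal C++ sketch of how a next token might be chosen from the model's output logits. It is an illustration only: the function names `softmax_with_temperature`, `sample_greedy`, and `sample_token` are hypothetical and are not taken from LLM-Engine's source, and the real engine may use a different strategy (for example top-k or top-p filtering).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from LLM-Engine): turn raw logits into
// probabilities using a temperature-scaled, numerically stable softmax.
std::vector<float> softmax_with_temperature(const std::vector<float>& logits,
                                            float temperature) {
    assert(!logits.empty() && temperature > 0.0f);
    // Subtract the maximum logit so std::exp never overflows.
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp((logits[i] - max_logit) / temperature);
        sum += probs[i];
    }
    for (float& p : probs) p /= sum;  // normalize so the values sum to 1
    return probs;
}

// Hypothetical helper: greedy decoding returns the index of the largest logit.
std::size_t sample_greedy(const std::vector<float>& logits) {
    assert(!logits.empty());
    return static_cast<std::size_t>(
        std::max_element(logits.begin(), logits.end()) - logits.begin());
}

// Hypothetical helper: draw a token id at random, weighted by the
// temperature-scaled probabilities.
std::size_t sample_token(const std::vector<float>& logits, float temperature,
                         std::mt19937& rng) {
    const std::vector<float> probs =
        softmax_with_temperature(logits, temperature);
    std::discrete_distribution<std::size_t> dist(probs.begin(), probs.end());
    return dist(rng);
}
```

In a generation loop, a function like `sample_token` would be called once per step on the logits for the final position, and the chosen token id would be appended to the context before the next forward pass. Lower temperatures concentrate probability on the top-scoring tokens, while `sample_greedy` always picks the single highest-scoring one.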