Section 01
bitnet.c: A Minimalist LLM Inference Engine Implemented in Pure C (Original Post Overview)
bitnet.c is a zero-dependency LLM inference engine written in pure C11 and designed for resource-constrained devices. It supports NEON/AVX2 SIMD acceleration on the CPU, Flash MoE expert caching, and TurboQuant 3-bit KV cache compression, among other techniques, so that it can run efficiently across a wide range of deployment scenarios.