Section 01
dotLLM: Core Guide to the .NET Native LLM Inference Engine
dotLLM is an LLM inference engine written entirely from scratch in C# on the .NET stack, with no dependency on llama.cpp or Python libraries. It supports multiple Transformer architectures, provides CPU SIMD optimizations and CUDA GPU acceleration, and implements advanced features such as PagedAttention, speculative decoding, and constrained decoding. Led by .NET MVP Konrad Kokosa, the project demonstrates .NET's potential in high-performance computing scenarios.
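To give a flavor of the kind of CPU SIMD optimization .NET makes possible, here is a minimal, hedged sketch (not dotLLM's actual code) of a vectorized dot product, the inner loop of most matrix-vector work in inference, using `System.Numerics.Vector<float>`, which the JIT lowers to SSE/AVX instructions where the hardware supports them:

```csharp
using System;
using System.Numerics;

static class SimdDot
{
    // Illustrative sketch: a dot product over Vector<float>-wide lanes.
    // Vector<float>.Count is hardware-dependent (e.g. 8 floats with AVX2).
    public static float Dot(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("spans must have equal length");

        int width = Vector<float>.Count;
        var acc = Vector<float>.Zero;
        int i = 0;

        // Main SIMD loop: multiply-accumulate one vector-width per iteration.
        for (; i <= a.Length - width; i += width)
        {
            acc += new Vector<float>(a.Slice(i, width)) *
                   new Vector<float>(b.Slice(i, width));
        }

        // Horizontal reduction of the accumulator, then a scalar tail
        // for the leftover elements that don't fill a full vector.
        float sum = Vector.Sum(acc);
        for (; i < a.Length; i++)
            sum += a[i] * b[i];
        return sum;
    }
}
```

The same pattern, wide vector arithmetic plus a scalar tail, generalizes to the matrix multiplications and attention-score computations an inference engine spends most of its time in; whether dotLLM structures its kernels exactly this way is an assumption here.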