Section 01
100-Day Inference Engineering Challenge: Guide to the Full-Stack Learning Path from CUDA to Multi-Cloud Scaling
This project is a systematic learning path built on Philip Kiely's Inference Engineering, designed to help developers master the full stack of LLM inference engineering, from low-level CUDA kernel optimization to upper-layer cloud-native architecture design. Structured as a 100-day progressive journey, it covers three core layers (single-GPU optimization, multi-GPU collaboration, and tooling and observability) through runnable scripts and experiments, ultimately building production-grade LLM deployment skills. The project is practice-oriented (all experiments are validated on DGX Spark clusters) and structured in its coverage, giving inference engineers a complete knowledge system.