Section 01
Introduction: Complete Practice of Tiny LLM on a Single RTX 3090
This project implements the full lifecycle of a tiny Large Language Model (LLM) on a single NVIDIA RTX 3090 graphics card, covering model architecture design, data preprocessing, the training loop, custom CUDA kernel development, and inference optimization. The core goal is to demonstrate that, even with limited consumer-grade hardware, it is possible to develop a deep understanding of the details of the Transformer architecture and gain hands-on engineering experience.
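As a first taste of the Transformer internals the project studies, here is a minimal, dependency-free sketch of scaled dot-product attention, the core operation inside every Transformer block. This is an illustrative toy using plain Python lists; the project itself would use a tensor framework and custom CUDA kernels for the real thing.

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of row vectors.

    For each query q: weights = softmax(q . K / sqrt(d)),
    output row = weighted sum of the rows of V.
    """
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, V))
                    for j in range(len(V[0]))])
    return out

# Tiny example: two tokens with 2-dimensional queries/keys/values.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
result = attention(Q, K, V)
```

Because each output row is a softmax-weighted average of the value rows, the first query (which matches the first key) puts more weight on the first value row, and each output row's entries sum to 1 here since the value rows do.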