
MyLLM: A Complete Open-Source Framework for Building Large Language Models from Scratch

Tags: Large Language Models · Transformer · PyTorch · Open-Source Framework · Machine Learning · Deep Learning · LLM Training · RLHF · LoRA · GitHub
Published 2026-05-03 12:40 · Recent activity 2026-05-03 12:47 · Estimated read: 5 min
Section 01

Introduction

MyLLM is an open-source project for building large language models from scratch. It provides a complete pipeline, from tokenizer training through to RLHF, and helps developers understand every detail of the Transformer architecture.

Section 02

Project Background and Motivation

In today's large language model ecosystem, frameworks like Hugging Face, PyTorch Lightning, and TRL are quite mature, but for the sake of ease of use they encapsulate many low-level details. For researchers and developers who want to understand how Transformers actually work, these "black box" abstractions become a barrier to learning.

The MyLLM project grew out of this need. Its core philosophy is "From Zero to Hero": by implementing each component by hand, users come to understand the complete technology stack behind modern large language models. The project is not just a framework but a systematic learning path.

Section 03

Architecture Design: Transparent Technology Stack

MyLLM adopts a layered architecture, breaking the complex process of training a large model into clear, readable modules:

Section 04

Core Module Composition

  • model.py: Defines the core model structure in the GPT/LLaMA style
  • api.py: Provides the LLM class, with support for model loading, text generation, and batch generation
  • Configs/: Uses dataclasses to define ModelConfig and GenerationConfig (see the sketch after this list)
  • Tokenizers/: Supports GPT2, LLaMA2, LLaMA3, and trainable tokenizers
  • Train/: Contains the SFT, DPO, and PPO training engines
  • utils/: Loaders, samplers, weight mappers, and model registries
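
To give a flavor of the dataclass-based configuration style in Configs/, here is a minimal sketch; the field names and default values are illustrative assumptions, not MyLLM's actual definitions:

    from dataclasses import dataclass

    # A minimal sketch of dataclass-driven configs in the spirit of Configs/.
    # Field names and defaults are assumptions for illustration only.
    @dataclass
    class ModelConfig:
        vocab_size: int = 32000    # tokenizer vocabulary size
        n_layers: int = 12         # number of Transformer blocks
        n_heads: int = 12          # attention heads per block
        d_model: int = 768         # hidden (embedding) dimension
        max_seq_len: int = 1024    # context window

    @dataclass
    class GenerationConfig:
        max_new_tokens: int = 128  # tokens to generate per call
        temperature: float = 0.8   # logit scaling before sampling
        top_k: int = 50            # keep only the k most likely tokens

Keeping configuration in plain dataclasses like this makes every hyperparameter visible and type-checked, which fits the project's transparency goal.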

Section 05

Training Engine Architecture

The training module uses a plug-in design and supports multiple training paradigms:

  • SFTTrainer: Supervised fine-tuning trainer (fully implemented; see the sketch after this list)
  • DPOTrainer: Direct Preference Optimization (reserved as a placeholder in the framework)
  • PPOTrainer: Proximal Policy Optimization / RLHF (reserved as a placeholder in the framework)
  • Accelerator: Supports multiple acceleration schemes, including single GPU, DDP, DeepSpeed, and FSDP
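
To make concrete what an SFTTrainer-style step computes, here is a generic sketch of one supervised fine-tuning update. The batch layout (input_ids plus labels, with -100 marking positions excluded from the loss) follows a common convention and is an assumption, not MyLLM's actual API:

    import torch.nn.functional as F

    def sft_step(model, batch, optimizer):
        # One supervised fine-tuning step: next-token cross-entropy.
        # Assumes model(input_ids) returns logits of shape (B, T, vocab)
        # and that labels use -100 for prompt/padding positions.
        logits = model(batch["input_ids"])
        loss = F.cross_entropy(
            logits[:, :-1].reshape(-1, logits.size(-1)),  # predict token t+1 from prefix
            batch["labels"][:, 1:].reshape(-1),           # targets shifted by one
            ignore_index=-100,
        )
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()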

Section 06

Learning Path: From Theory to Practice

MyLLM provides three progressive learning paths to meet the needs of users at different stages:

Section 07

1. Guided Notebooks (notebooks/)

Contains 21 carefully designed Jupyter notebooks covering every step from word embeddings to attention mechanisms to complete model training. Each notebook pairs detailed theoretical explanations with runnable code examples.
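
The attention mechanism those notebooks build up can be written in a few lines; the following is a generic textbook sketch rather than code taken from the repository:

    import torch

    def scaled_dot_product_attention(q, k, v, mask=None):
        # softmax(q @ k^T / sqrt(d_k)) @ v, the core of every Transformer layer
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))  # causal/padding mask
        return torch.softmax(scores, dim=-1) @ v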

Section 08

2. Independent Experiment Modules (Modules/)

Breaks complex concepts into independent experimental units, with each module focusing on one core concept, such as positional encoding, multi-head attention, or layer normalization. This "master one concept at a time" design flattens the learning curve.
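
Layer normalization is a good example of a concept small enough for one module. A from-scratch version, shown here as a generic sketch rather than MyLLM's implementation, fits in a dozen lines:

    import torch
    import torch.nn as nn

    class LayerNorm(nn.Module):
        # Normalize each token's features to zero mean and unit variance,
        # then apply a learnable scale (gamma) and shift (beta).
        def __init__(self, dim, eps=1e-5):
            super().__init__()
            self.gamma = nn.Parameter(torch.ones(dim))
            self.beta = nn.Parameter(torch.zeros(dim))
            self.eps = eps

        def forward(self, x):
            mean = x.mean(dim=-1, keepdim=True)
            var = x.var(dim=-1, keepdim=True, unbiased=False)
            return self.gamma * (x - mean) / torch.sqrt(var + self.eps) + self.beta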