Zing Forum

From Zero Derivation to Practical Application: A Complete Open-Source Project of Generative AI Course Notes

A systematic set of Jupyter Notebooks supporting generative AI courses, covering complete derivations and visual implementations from basic probability theory to cutting-edge models like VAE, GAN, and Diffusion.

Tags: generative AI, VAE, GAN, Diffusion, Transformer, machine learning, Bayesian inference, deep learning, educational, Jupyter Notebook
Published 2026-05-20 07:12 | Recent activity 2026-05-20 07:18 | Estimated read 6 min

Section 01

[Introduction] From Zero Derivation to Practical Application: A Complete Open-Source Project of Generative AI Course Notes

Generative AI is spreading rapidly, but resources that explain its underlying principles remain scarce. This open-source project by GitHub user HAYDARKILIC is a set of course notes written as Jupyter Notebooks that pair theory with Python implementations. Key formulas are derived from first principles and visualized, which bridges the gap between application-level practice and abstract mathematics. The content covers probability theory, Bayesian inference, deep generative models (VAE, GAN, Diffusion), and Transformer-based large language models. Beginners can work through it step by step, and practitioners can use it to fill gaps in their knowledge.

Section 02

Project Background and Overall Architecture

This note series grew out of a systematic generative AI course. Four core chapters are complete so far, spanning probability basics through Transformers. Each Notebook follows a three-stage structure: "Theoretical Derivation → Formula Implementation → Data Validation". The chapters build on one another: Chapter 1 covers probability theory and decision theory; Chapter 2 covers Bayesian inference; Chapter 3 covers deep generative models; Chapter 4 covers Transformers and large language models.

Section 03

Theoretical Foundations: Probability Theory and Bayesian Inference

Chapter 1 uses MNIST as an example to build mathematical intuition. Polynomial regression and curve fitting demonstrate the trade-off between overfitting and underfitting; Bayes' theorem is derived from the rules of probability, with a medical diagnosis example illustrating the base rate fallacy; maximum likelihood estimation (MLE) is derived and its bias discussed; decision theory is used to compare generative and discriminative models. Chapter 2 explores Bayesian inference. The "number game" case introduces the size principle; the Beta-binomial model shows conjugate priors and sequential updates; the Dirichlet-multinomial model extends this to the multi-class case; mixture models are used to compare different estimation methods.
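
The medical diagnosis example and the Beta-binomial sequential update can be sketched in a few lines of Python. This is a minimal illustration, not code from the notebooks: the function names are hypothetical, and the numbers (1% prevalence, 95% sensitivity, 90% specificity, a uniform Beta(1, 1) prior, and the four observations) are assumed for demonstration only.

```python
def posterior_positive(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def beta_binomial_update(alpha, beta, observations):
    """Sequentially update a Beta(alpha, beta) prior with 0/1 observations.

    Conjugacy means each success adds 1 to alpha and each failure adds 1
    to beta, so the posterior stays in the Beta family at every step.
    """
    history = [(alpha, beta)]
    for x in observations:
        alpha += x
        beta += 1 - x
        history.append((alpha, beta))
    return history


# Assumed illustrative numbers: a rare condition and a fairly accurate test.
p = posterior_positive(prevalence=0.01, sensitivity=0.95, specificity=0.90)
print(f"P(disease | positive) = {p:.3f}")

# Uniform prior, then three successes and one failure observed in sequence.
history = beta_binomial_update(1, 1, [1, 0, 1, 1])
a, b = history[-1]
print(f"posterior Beta({a}, {b}), mean = {a / (a + b):.3f}")
```

With these assumed numbers the posterior probability of disease after a positive test stays below 10%, even though the test is right most of the time. Ignoring the low prevalence is exactly the base rate fallacy.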

Section 04

Panorama of Deep Generative Model Technologies

Chapter 3 starts by deriving the KL divergence to lay the information-theoretic foundation. Latent spaces and the manifold hypothesis are demonstrated with PCA on MNIST. The VAE part covers the ELBO derivation, the reparameterization trick, and β-VAE regularization; the GAN part explains architecture design, the optimal discriminator, and mode collapse; the diffusion part (DDPM) covers the forward and reverse processes and a SimpleUNet implementation. The model comparison uses the Fréchet Inception Distance (FID) for quantitative evaluation and radar charts to compare the models along several dimensions.
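
Three of the building blocks named above have compact closed forms, sketched here with NumPy. This is an illustration under stated assumptions rather than the notebooks' code: the function names are hypothetical, the VAE posterior is assumed to be a diagonal Gaussian with a standard normal prior, and the linear noise schedule (1e-4 to 0.02 over 1000 steps) is the common DDPM default, which the notebooks may or may not use.

```python
import numpy as np

rng = np.random.default_rng(0)


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I).

    The randomness lives in eps, so z is a deterministic, differentiable
    function of mu and log_var.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps


def ddpm_forward_sample(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    alpha_bar = np.cumprod(1.0 - betas)
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


# Two latent codes: the first matches the prior exactly, the second does not.
mu = np.array([[0.0, 0.0], [1.0, -1.0]])
log_var = np.zeros_like(mu)
kl = kl_to_standard_normal(mu, log_var)
z = reparameterize(mu, log_var, rng)
print("KL per sample:", kl)

# Assumed linear schedule; by the last step almost no signal from x0 remains.
betas = np.linspace(1e-4, 0.02, 1000)
x0 = np.ones((4, 4))
x_late = ddpm_forward_sample(x0, t=999, betas=betas, rng=rng)
print("remaining signal scale:", np.sqrt(np.cumprod(1.0 - betas)[-1]))
```

The KL term is zero only when the approximate posterior equals the prior, which is why it acts as a regularizer in the ELBO, and why weighting it more heavily (as β-VAE does) pulls the latent codes toward the prior.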

Section 05

Core Analysis of Transformers and Large Language Models

Chapter 4 motivates the attention mechanism by starting from the vanishing gradient problem in RNNs. It analyzes the encoder-decoder architecture and its information bottleneck; implements Bahdanau additive attention and scaled dot-product attention, emphasizing why scaling by √d_k is necessary; shows how multi-head attention splits the representation into subspaces that capture different information; compares sinusoidal, RoPE, and ALiBi positional encoding schemes; and explains the evolution of the feed-forward network, layer normalization placement, and the training process of a Mini GPT.
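
The role of the √d_k scaling can be seen in a small NumPy sketch of scaled dot-product attention. This is a minimal single-head version without masking, written for illustration; the function names, the `scale` switch, and the random inputs are assumptions, not the notebooks' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Softmax with the row maximum subtracted for numerical stability."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def scaled_dot_product_attention(q, k, v, scale=True):
    """Return (output, weights) for softmax(q k^T / sqrt(d_k)) v.

    Setting scale=False skips the division so the effect can be compared.
    """
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2)
    if scale:
        scores = scores / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)
seq_len, d_k = 6, 512
q = rng.standard_normal((seq_len, d_k))
k = rng.standard_normal((seq_len, d_k))
v = rng.standard_normal((seq_len, d_k))

out, w_scaled = scaled_dot_product_attention(q, k, v, scale=True)
_, w_raw = scaled_dot_product_attention(q, k, v, scale=False)

# Largest attention weight each query assigns to any single key.
print("scaled:  ", w_scaled.max(axis=-1).round(3))
print("unscaled:", w_raw.max(axis=-1).round(3))
```

For unit-variance inputs the raw dot products have variance close to d_k, so without scaling the softmax saturates toward a one-hot distribution and its gradients become very small. Dividing by √d_k brings the scores back to roughly unit variance.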

Section 06

Learning Value and Practical Significance

The greatest value of this note series lies in its "derive from scratch" approach: each formula is derived from first principles and accompanied by code and visualization. For practitioners, it offers a systematic way to build mathematical intuition and modeling skills; for educators, it shows how interactive learning materials can be designed. Readers need basic knowledge of linear algebra, calculus, and probability theory, plus working Python skills.

Section 07

Conclusion: Respect Principles, Grasp the Essence

Generative AI is reshaping creativity and intelligence, and the field needs both application engineers and researchers who understand the underlying layers. This open-source note series is a valuable resource for understanding those fundamentals. It reflects an academic attitude that respects first principles and keeps the elegance of the mathematics in view in a fast-moving era.