Zing Forum

Implementing a Neural Network from Scratch with NumPy: A Progressive Learning Project

This article introduces a neural network project implemented purely with NumPy, covering everything from a single neuron to full MNIST training, helping developers deeply understand the essence of backpropagation and gradient descent.

Tags: Neural Networks, NumPy, Backpropagation, Deep Learning, MNIST, Machine Learning, Gradient Descent, From-Scratch Implementation
Published 2026-06-11 11:13 | Recent activity 2026-06-11 11:20 | Estimated read 5 min

Section 03

Project Background and Significance

With deep learning frameworks such as PyTorch, TensorFlow, and Keras now in wide use, most developers can train a model with a few high-level calls such as Keras's .fit(). That convenience tends to hide the underlying mathematics. When you run into exploding gradients, vanishing gradients, or training that fails to converge, a shallow understanding of backpropagation and gradient descent often leaves you adjusting hyperparameters blindly.

This project was created to address exactly that problem. The author implemented a complete neural network from scratch in pure NumPy, without relying on any deep learning framework. Working through six progressive files, readers build by hand a network that reaches a reported accuracy of about 95.57% on the MNIST handwritten digit dataset.

Section 04

Project Structure: From Single Neuron to Complete Network

The core of the project is its progressive learning path. Each file is a self-contained lesson that adds new concepts on top of the previous one:

Section 05

01_single_neuron.py — Basics of a Single Neuron

This is the starting point of the project. The code implements the most basic computational unit of a neural network: a weighted sum followed by an activation function. The neuron computes z = W·x + b, then maps z into the interval (0, 1) with the sigmoid function σ(z) = 1/(1+e^(-z)). The step looks simple, but it is the key to understanding how a neural network processes input data.
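
The computation described above can be sketched in a few lines. This is a minimal illustration, not the project's actual file: the function names `sigmoid` and `neuron_forward` and the input values are assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    """Map any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_forward(W, x, b):
    """Weighted sum z = W·x + b followed by the sigmoid activation."""
    z = np.dot(W, x) + b
    return sigmoid(z)

# Hypothetical values, chosen only for illustration.
x = np.array([0.5, -1.0, 2.0])   # three input features
W = np.array([0.4, 0.3, -0.2])   # one weight per feature
b = 0.1                          # bias

# Here z = 0.2 - 0.3 - 0.4 + 0.1 = -0.4, so the output is sigmoid(-0.4).
a = neuron_forward(W, x, b)
print(a)
```

Because the weighted sum is negative here, the activation lands below 0.5; a positive sum would push it above 0.5.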

Section 06

02_forward_pass.py — Forward Propagation

After the single neuron, the project shows how to organize multiple neurons into layers and connect the layers through matrix multiplication. This highlights a core practical advantage: one matrix operation processes a whole batch of samples at once, which makes training on large datasets feasible. The code wraps forward propagation in a class, laying the foundation for later extensions.
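
One way such a class structure might look is sketched below. The class names `DenseLayer` and `Network`, the layer sizes, and the weight initialization are all assumptions made for illustration; the project's own file may be organized differently.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DenseLayer:
    """Fully connected layer computing a = sigmoid(X @ W + b)."""

    def __init__(self, n_inputs, n_neurons, rng):
        # Small random weights and zero biases (an illustrative choice).
        self.W = rng.normal(0.0, 0.1, size=(n_inputs, n_neurons))
        self.b = np.zeros(n_neurons)

    def forward(self, X):
        # X has shape (batch, n_inputs); the bias broadcasts over the batch.
        return sigmoid(X @ self.W + self.b)

class Network:
    """Chains layers so the output of one becomes the input of the next."""

    def __init__(self, layer_sizes, seed=0):
        rng = np.random.default_rng(seed)
        self.layers = [
            DenseLayer(n_in, n_out, rng)
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
        ]

    def forward(self, X):
        for layer in self.layers:
            X = layer.forward(X)
        return X

# Hypothetical sizes: 4 input features, 5 hidden neurons, 3 outputs.
net = Network([4, 5, 3])
batch = np.random.default_rng(1).normal(size=(8, 4))  # 8 samples at once
out = net.forward(batch)
print(out.shape)  # (8, 3)
```

Each row of `out` is the result for one sample, so a single matrix product per layer handles the whole batch without a Python loop over samples.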

Section 07

03_loss_function.py — Loss Function

Training a neural network requires quantifying the gap between predictions and true values. The project uses Mean Squared Error (MSE) as the loss function, which reduces the multi-dimensional output to a single scalar. That scalar is the optimization target: training is essentially the process of repeatedly lowering this loss value.
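
As a small worked example, MSE can be computed as below. The function name `mse_loss` and the sample values are assumptions, and this sketch averages over every element; some implementations instead average per sample or include a factor of 1/2, so the project's exact convention may differ.

```python
import numpy as np

def mse_loss(y_pred, y_true):
    """Mean of squared differences over all samples and all outputs."""
    return np.mean((y_pred - y_true) ** 2)

# Hypothetical targets and predictions: 2 samples, 2 outputs each.
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.8, 0.2], [0.4, 0.6]])

# Squared errors are 0.04, 0.04, 0.16, 0.16; their mean is 0.1
# (up to floating-point rounding).
loss = mse_loss(y_pred, y_true)
print(loss)
```

A perfect prediction gives a loss of exactly zero, and the loss grows quadratically as predictions move away from the targets.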

Section 08

04a_backprop_single_neuron.py — Backpropagation Principles

This is the most instructive part of the project. The code traces the chain rule step by step: dL/dw = dL/da · da/dz · dz/dw. By computing each factor explicitly, readers can see how the error propagates back from the output, through the activation, to the weights. Frameworks usually hide this process behind automatic differentiation.
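
The chain-rule factors can be written out one by one, as in the sketch below. It assumes a single sigmoid neuron with a squared-error loss L = (a - y)^2 on one sample; the function names, input values, and learning rate are illustrative and not taken from the project. A finite-difference check is included to confirm the analytic gradient.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_fn(w, b, x, y):
    """Squared error of a single sigmoid neuron on one sample."""
    a = sigmoid(np.dot(w, x) + b)
    return (a - y) ** 2

def gradients(w, b, x, y):
    """Apply dL/dw = dL/da * da/dz * dz/dw, one factor at a time."""
    z = np.dot(w, x) + b
    a = sigmoid(z)
    dL_da = 2.0 * (a - y)        # derivative of (a - y)^2 w.r.t. a
    da_dz = a * (1.0 - a)        # derivative of the sigmoid
    dz_dw = x                    # derivative of w·x + b w.r.t. w
    dL_dw = dL_da * da_dz * dz_dw
    dL_db = dL_da * da_dz        # dz/db = 1
    return dL_dw, dL_db

# Hypothetical sample and parameters.
x = np.array([0.5, -1.0])
y = 1.0
w = np.array([0.3, -0.2])
b = 0.0

dL_dw, dL_db = gradients(w, b, x, y)

# Central finite differences as an independent check of dL/dw.
eps = 1e-6
numeric = np.zeros_like(w)
for i in range(w.size):
    step = np.zeros_like(w)
    step[i] = eps
    numeric[i] = (loss_fn(w + step, b, x, y) - loss_fn(w - step, b, x, y)) / (2 * eps)
print(np.allclose(dL_dw, numeric, atol=1e-8))

# One gradient descent step with an illustrative learning rate.
lr = 0.5
before = loss_fn(w, b, x, y)
after = loss_fn(w - lr * dL_dw, b - lr * dL_db, x, y)
print(before, after)
```

Comparing the analytic gradient with a numerical estimate is a common way to catch sign or indexing mistakes before scaling the same logic up to multiple layers.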