Zing Forum


Implementing a Neural Network from Scratch with NumPy: Understanding the Essence of Backpropagation

This article walks through a feedforward neural network implemented using only NumPy and uses the XOR problem to demonstrate how backpropagation works, helping readers understand the fundamental principles of deep learning.

Neural Networks · NumPy · Backpropagation · XOR Problem · Deep Learning Fundamentals · Machine Learning · Python
Published 2026-05-26 08:10 · Recent activity 2026-05-26 08:18 · Estimated read 7 min

Section 01

Introduction: Implementing a Neural Network from Scratch with NumPy to Understand the Essence of Backpropagation

This article introduces a project that builds a feedforward neural network from scratch using only NumPy. Using the classic XOR problem as a case study, it examines the core mechanism of backpropagation to help readers understand the underlying principles of deep learning. The project targets a common gap: developers who use framework APIs often do not understand the mechanisms underneath, and a minimal implementation makes those mechanisms visible.


Section 02

Project Background and Motivation

With frameworks like PyTorch and TensorFlow now dominant, developers often rely on high-level APIs without an intuitive understanding of the underlying mechanisms. This project builds a network from scratch using NumPy so that learners can see the mathematics behind the code. The XOR problem was chosen as the case study because it is a milestone in the history of neural networks: in 1969, Minsky and Papert showed in their book Perceptrons that a single-layer perceptron cannot represent XOR, because its two classes are not linearly separable. Multi-layer networks trained with backpropagation later overcame this limitation.
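The linear-separability claim can be checked numerically. The sketch below is not part of the project; it searches a small, arbitrarily chosen grid of weights for a single threshold unit and counts how many settings reproduce XOR exactly, with AND as a separable contrast. The function name and grid are illustrative assumptions.

```python
import itertools
import numpy as np

# The four binary input pairs, with XOR and AND targets for comparison.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])
y_and = np.array([0, 0, 0, 1])

def count_separating_lines(X, y, grid):
    """Count (w1, w2, b) triples on the grid whose threshold unit fits y exactly."""
    hits = 0
    for w1, w2, b in itertools.product(grid, repeat=3):
        # A single-layer perceptron: fire when the weighted sum exceeds 0.
        pred = (w1 * X[:, 0] + w2 * X[:, 1] + b > 0).astype(int)
        if np.array_equal(pred, y):
            hits += 1
    return hits

grid = np.linspace(-2.0, 2.0, 21)  # hypothetical search grid with step 0.2
xor_hits = count_separating_lines(X, y_xor, grid)
and_hits = count_separating_lines(X, y_and, grid)
print("XOR solutions:", xor_hits, "| AND solutions:", and_hits)
```

A grid search is only an illustration, not a proof, but the result matches the algebra: fitting XOR would require b ≤ 0, w1 + b > 0, w2 + b > 0 and w1 + w2 + b ≤ 0 at the same time, and these constraints contradict each other.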


Section 03

Network Architecture Design

The project uses a three-layer design:

  • Input layer: 2 neurons, corresponding to the two binary inputs of XOR;
  • Hidden layer: 4 neurons, extracting non-linear features. Two hidden neurons are in principle enough to represent XOR, so 4 is not the minimum, but the extra capacity usually makes training more reliable. Without a hidden layer the model would be a linear classifier and could not handle XOR;
  • Output layer: 1 neuron with a Sigmoid activation, producing a value between 0 and 1 that is interpreted as the probability of class 1.
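The 2-4-1 layout above translates directly into two weight matrices. The following is a minimal sketch of how the parameters might be set up, not the project's actual code: the variable names, the normal initialization, the seed, and the presence of bias vectors are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

n_input, n_hidden, n_output = 2, 4, 1  # layer sizes from the description

# Each weight matrix maps one layer to the next.
# Bias vectors are an assumption; the article does not say whether they are used.
W1 = rng.normal(0.0, 1.0, size=(n_input, n_hidden))
b1 = np.zeros((1, n_hidden))
W2 = rng.normal(0.0, 1.0, size=(n_hidden, n_output))
b2 = np.zeros((1, n_output))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # all four XOR inputs

# Shape check only: activations are omitted here on purpose.
hidden_pre = X @ W1 + b1           # (4 samples, 4 hidden units)
output_pre = hidden_pre @ W2 + b2  # (4 samples, 1 output)
print(hidden_pre.shape, output_pre.shape)
```

Stacking the four inputs as rows of X lets one matrix product process the whole dataset at once.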

Section 04

Core Mechanisms: Activation Function and Backpropagation

Sigmoid Activation Function

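Formula: σ(x) = 1/(1 + e^(-x)). It introduces non-linearity, which lets the network learn complex decision boundaries, and it squashes outputs into the range (0, 1) so they can be read as probabilities. Its drawback is vanishing gradients: the derivative σ'(x) = σ(x)(1 − σ(x)) is at most 0.25 and approaches 0 for large |x|, so gradients shrink as they pass through saturated units.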
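A short sketch of the Sigmoid function and its derivative in NumPy makes the saturation effect concrete. The function names are illustrative and not taken from the project.

```python
import numpy as np

def sigmoid(x):
    """Logistic function: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    """Derivative of the logistic function: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

xs = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print("sigmoid:", np.round(sigmoid(xs), 4))
print("gradient:", np.round(sigmoid_grad(xs), 6))
```

The gradient peaks at x = 0 and is already tiny at |x| = 10, which is why saturated units pass very little error signal backward.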

Backpropagation Algorithm

Core steps:

  1. Forward propagation: the input flows through each layer to produce the prediction and the loss;
  2. Backpropagation: the error is propagated backward from the output layer, the gradient of the loss with respect to each weight is computed with the chain rule, and the weights are updated in the direction opposite to the gradient.

The project trains the network for 10,000 iterations, each performing one complete forward and backward pass, so that the weights converge to values that solve XOR.
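The two steps above can be sketched as a complete training loop. This is a minimal illustration under stated assumptions, not the project's source: the mean squared error loss, the Sigmoid hidden activation, the bias terms, the learning rate of 1.0, and the seed are all choices made here because the article does not specify them.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)  # assumed seed
W1 = rng.normal(0.0, 1.0, (2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(0.0, 1.0, (4, 1))
b2 = np.zeros((1, 1))
lr = 1.0                         # assumed learning rate
losses = []

for epoch in range(10_000):
    # 1. Forward propagation: input -> hidden -> output -> loss.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))  # assumed MSE loss

    # 2. Backpropagation: apply the chain rule layer by layer.
    d_out = 2.0 * (out - y) / len(X) * out * (1.0 - out)  # dLoss/d(output pre-activation)
    dW2 = h.T @ d_out
    db2 = d_out.sum(axis=0, keepdims=True)
    d_h = (d_out @ W2.T) * h * (1.0 - h)                  # dLoss/d(hidden pre-activation)
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0, keepdims=True)

    # Gradient descent: step against the gradient.
    W2 -= lr * dW2
    b2 -= lr * db2
    W1 -= lr * dW1
    b1 -= lr * db1

print("first loss:", round(losses[0], 4), "last loss:", round(losses[-1], 4))
print(np.round(out, 4))
```

How closely the outputs approach 0 and 1 depends on the seed, the learning rate, and the loss, so the numbers printed by this sketch will not match the project's reported values.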

Section 05

Training Process and Result Analysis

Training loss trend:

  • Epoch 0: Loss = 0.4965 (close to random guess);
  • Epoch 1000: Loss = 0.4948 (almost unchanged; the network is still on a plateau and early progress is slow);
  • Epoch 9000: Loss = 0.0864 (basically converged).

Prediction result verification:

    Input    Predicted Output    Expected Value
    [0,0]    0.1013              0
    [0,1]    0.9269              1
    [1,0]    0.9201              1
    [1,1]    0.0593              0

All predictions are close to the expected values, showing that the network has learned the XOR logic.
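The article does not say which loss function produces the reported values. As a quick arithmetic check, the sketch below computes two common candidates from the four reported predictions. The mean absolute error comes out near 0.078, in the same range as the 0.0864 logged at epoch 9000, whereas the mean squared error is roughly ten times smaller. This suggests, but does not confirm, that the logged loss is a mean absolute error.

```python
import numpy as np

# Predictions and targets copied from the reported results.
predicted = np.array([0.1013, 0.9269, 0.9201, 0.0593])
expected = np.array([0.0, 1.0, 1.0, 0.0])

mae = np.mean(np.abs(predicted - expected))  # mean absolute error
mse = np.mean((predicted - expected) ** 2)   # mean squared error
labels = (predicted > 0.5).astype(int)       # threshold at 0.5 to get class labels

print("MAE:", round(mae, 4))   # about 0.0784
print("MSE:", round(mse, 4))   # about 0.0064
print("labels:", labels)       # [0 1 1 0]
```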

Section 06

Practical Significance and Learning Value

Although this project is small, it contains the core principles of deep learning. Through it, learners can:

  1. Understand tensor operations (how matrix multiplication carries signals from layer to layer);
  2. Master gradient descent (the mathematics of the weight update);
  3. Debug network behavior (diagnose training status by observing the loss curve);
  4. Build intuition about the role of hyperparameters (learning rate, hidden layer size).

It is an excellent starting point for those who want to understand deep learning in depth rather than just call APIs.
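One way to build the hyperparameter intuition mentioned in the list is a small sweep. The sketch below is an illustrative experiment rather than part of the project: the helper train_xor, the MSE loss, the 2,000-epoch budget, the seed, and the chosen values for hidden size and learning rate are all assumptions.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_xor(n_hidden, lr, epochs=2000, seed=0):
    """Train a 2-n_hidden-1 sigmoid network on XOR; return (first, last) MSE."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, (2, n_hidden))
    b1 = np.zeros((1, n_hidden))
    W2 = rng.normal(0.0, 1.0, (n_hidden, 1))
    b2 = np.zeros((1, 1))
    history = []
    for _ in range(epochs):
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        history.append(float(np.mean((out - y) ** 2)))
        # Gradients via the chain rule, computed before any weight changes.
        d_out = 2.0 * (out - y) / len(X) * out * (1.0 - out)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0, keepdims=True)
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0, keepdims=True)
    return history[0], history[-1]

results = {}
for n_hidden in (2, 4):
    for lr in (0.1, 1.0):
        results[(n_hidden, lr)] = train_xor(n_hidden, lr)

for (n_hidden, lr), (first, last) in results.items():
    print(f"hidden={n_hidden} lr={lr}: loss {first:.4f} -> {last:.4f}")
```

Comparing the final losses across rows gives a rough feel for how each setting affects convergence. A single seed is not conclusive, so repeating the sweep with several seeds is advisable.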

Section 07

Summary and Learning Suggestions

This project shows the essence of neural networks in a concise way: behind complex frameworks are simple mathematical operations. Once the basics are understood, advanced tools become easier to use and problems become easier to diagnose. Suggestions for beginners: first run the code and observe the results, then read the source code line by line, and finally try modifying the network structure or hyperparameters and observe how training changes. Combining hands-on practice with reflection is the most reliable way to build a deep understanding.