Zing Forum

PyTorch Handwritten Digit Recognition Practice: Deep Learning Principles Behind 98.16% Accuracy

A feedforward neural network built and trained from scratch in PyTorch, with no pre-trained models, that reaches 98.16% test accuracy on the MNIST dataset. The project walks through the engineering implementation of core concepts such as neural networks, backpropagation, and gradient descent.

Tags: PyTorch, MNIST, feedforward neural network, handwritten digit recognition, backpropagation, gradient descent, Batch Normalization, Dropout, regularization
Published 2026-06-14 05:42 | Recent activity 2026-06-14 05:52 | Estimated read 6 min

Section 01

PyTorch Handwritten Digit Recognition Practice: Project Guide for 98.16% Accuracy

This project is a feedforward neural network built from scratch in PyTorch that reaches 98.16% test accuracy on the MNIST handwritten digit recognition task. It uses no pre-trained models and shows, end to end, the engineering implementation of core concepts such as neural networks, backpropagation, and gradient descent, which makes it well suited for teaching.

Section 02

Project Background and Overview

Project Overview: This deep learning project is built from scratch, without the shortcut of pre-trained models. The author implemented a 4-layer feedforward neural network in PyTorch and reached 98.16% test accuracy on MNIST. As an engineering exercise it is well suited for teaching, because it shows how the core components of modern deep learning work together.

Section 03

Network Architecture and Training Process

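Network Architecture: 4-layer feedforward neural network (four Linear layers: three hidden layers and one output layer; the input layer only flattens the image and has no weights)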

  • Input layer: 784 neurons, flattening 28×28 grayscale images
  • Hidden layer 1: 512 neurons, Linear(784→512)→BatchNorm→ReLU→Dropout(0.3)
  • Hidden layer 2: 256 neurons, Linear(512→256)→BatchNorm→ReLU→Dropout(0.2)
  • Hidden layer 3: 128 neurons, Linear(256→128)→BatchNorm→ReLU
  • Output layer: 10 neurons, Linear(128→10)→Softmax
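
The layer list above can be sketched as a PyTorch module. This is a minimal sketch, not the author's code: the class name `DigitMLP` is hypothetical, and the sketch assumes the module returns raw logits with Softmax applied only at inference, since `nn.CrossEntropyLoss` already applies log-softmax internally during training. Where the original project places its Softmax is not stated.

```python
import torch
from torch import nn


class DigitMLP(nn.Module):
    """Feedforward classifier following the layer list (hypothetical class name)."""

    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                       # (N, 1, 28, 28) -> (N, 784)
            nn.Linear(784, 512), nn.BatchNorm1d(512), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(512, 256), nn.BatchNorm1d(256), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(256, 128), nn.BatchNorm1d(128), nn.ReLU(),
            nn.Linear(128, 10),                 # raw logits, one per digit
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


model = DigitMLP()
model.eval()                                    # Dropout off, BatchNorm uses running stats
with torch.no_grad():
    logits = model(torch.randn(4, 1, 28, 28))   # random stand-in for a batch of images
    probs = torch.softmax(logits, dim=1)        # Softmax applied at inference time
```

Calling `model.train()` before each training epoch re-enables Dropout and batch statistics; forgetting to switch back to `model.eval()` for validation is a common source of noisy accuracy numbers.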

Training Process: The Adam optimizer was used. Validation accuracy dropped at the 6th epoch, which triggered the ReduceLROnPlateau scheduler to lower the learning rate; the best validation accuracy, 98.24%, was reached at the 10th epoch.
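
A training loop with Adam and ReduceLROnPlateau might look roughly like the sketch below. Everything numeric here is an assumption rather than the project's configuration: the learning rate, weight decay, `factor`, `patience`, batch size, and epoch count are illustrative, and a small stand-in model with random tensors replaces MNIST so the sketch runs without a download.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in model and random data (assumptions) so the loop runs without MNIST.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 10))
x_train, y_train = torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))
x_val, y_val = torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,))

criterion = nn.CrossEntropyLoss()
# lr and weight_decay are assumed values, not taken from the project.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# mode="max" because the monitored quantity is validation accuracy.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.5, patience=0
)

lr_history = []
for epoch in range(5):
    model.train()
    for i in range(0, len(x_train), 64):
        xb, yb = x_train[i:i + 64], y_train[i:i + 64]
        optimizer.zero_grad()                # clear gradients from the previous step
        loss = criterion(model(xb), yb)      # forward pass + cross-entropy on logits
        loss.backward()                      # backpropagation
        optimizer.step()                     # Adam parameter update

    model.eval()
    with torch.no_grad():
        val_acc = (model(x_val).argmax(dim=1) == y_val).float().mean().item()
    scheduler.step(val_acc)                  # lowers lr when val_acc stops improving
    lr_history.append(optimizer.param_groups[0]["lr"])
```

With `patience=0` the scheduler reacts to the first epoch that fails to improve on the best validation accuracy, which mirrors the behaviour described for the 6th epoch; a larger `patience` would tolerate more flat epochs before reducing the rate.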

Section 04

Performance Metrics and Digit-wise Analysis

Performance Metrics

| Metric | Value | Status |
|---|---|---|
| Test Accuracy | 98.16% | ✅ Exceeds 98% target |
| Test Loss | 0.0654 | |
| Best Validation Accuracy | 98.24% | |
| Training-Validation Gap | 0.08% | ✅ Near-zero overfitting |
| Total Parameters | 568,970 | |
| Training Device | CPU only | ✅ No GPU required |
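
As a consistency check on the reported total of 568,970 parameters, the count can be reproduced by hand. The sketch below is plain arithmetic, not the author's code, and it assumes each BatchNorm layer contributes two learnable values per feature (scale and shift). Under that assumption the reported total matches BatchNorm on the first two hidden layers only, while learnable BatchNorm on all three hidden layers would give 569,226. The source does not say which configuration the code uses.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features: int) -> int:
    """Learnable scale (gamma) and shift (beta), one pair per feature."""
    return 2 * n_features


sizes = [784, 512, 256, 128, 10]
linear_total = sum(linear_params(a, b) for a, b in zip(sizes, sizes[1:]))
bn_all_hidden = sum(batchnorm_params(n) for n in (512, 256, 128))
bn_first_two = sum(batchnorm_params(n) for n in (512, 256))

print(linear_total)                   # 567434 (Linear layers only)
print(linear_total + bn_all_hidden)   # 569226 (BatchNorm on all three hidden layers)
print(linear_total + bn_first_two)    # 568970 (BatchNorm on the first two hidden layers)
```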

Digit-wise Accuracy

| Digit | Accuracy | Visual Difficulty | Most Confused With |
|---|---|---|---|
| 1 | 99.21% | 🟢 Easy | |
| 9 | 96.93% | 🔴 Hardest | 3, 4 |

Digit 9 is the hardest: it is most often confused with 3 and 4, which it visually resembles, and this confusion pattern is consistent with how humans misread handwriting.
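
Per-digit accuracy and the "most confused" column are typically read off a confusion matrix. The sketch below shows one way to do that; the function name `per_class_report` is hypothetical and the 3-class matrix is made-up toy data, not the project's results.

```python
def per_class_report(cm):
    """cm[i][j] = number of samples with true class i predicted as class j.

    Returns {class: (accuracy, up to two most frequent wrong predictions)}.
    """
    report = {}
    for i, row in enumerate(cm):
        total = sum(row)
        accuracy = row[i] / total if total else 0.0
        # Off-diagonal entries are misclassifications of class i.
        errors = [(count, j) for j, count in enumerate(row) if j != i and count > 0]
        most_confused = [j for _, j in sorted(errors, reverse=True)[:2]]
        report[i] = (accuracy, most_confused)
    return report


# Toy confusion matrix (illustrative numbers only).
toy_cm = [
    [97, 2, 1],
    [0, 100, 0],
    [6, 2, 92],
]
report = per_class_report(toy_cm)
hardest = min(report, key=lambda c: report[c][0])  # class with the lowest accuracy
```

In this toy example class 2 is the hardest and is mostly mistaken for class 0, the same kind of reading that yields "9 is most confused with 3 and 4" on the real matrix.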

Section 05

Core Concepts and Technical Highlights

Core Concepts Demonstrated: neural networks as universal function approximators, forward propagation as linear algebra, backpropagation and automatic differentiation, gradient descent optimization, cross-entropy loss, and regularization (Dropout, BatchNorm, Weight Decay).
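
To make the link between cross-entropy loss, backpropagation, and gradient descent concrete, the sketch below works through a single example in plain Python. It relies on the standard result that the gradient of softmax cross-entropy with respect to the logits is `softmax(z) - one_hot(target)`, checks it against a finite-difference estimate, and then takes one gradient descent step. The logits, target, and learning rate are arbitrary illustrative values.

```python
import math


def softmax(z):
    m = max(z)                                   # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(z, target):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(z)[target])


def grad_logits(z, target):
    """Analytic gradient of the loss w.r.t. the logits: softmax(z) - one_hot(target)."""
    p = softmax(z)
    return [p_k - (1.0 if k == target else 0.0) for k, p_k in enumerate(p)]


logits, target, lr = [2.0, 0.5, -1.0], 1, 0.5    # illustrative values
grad = grad_logits(logits, target)

# Finite-difference estimate of the same gradient, one logit at a time.
eps = 1e-6
numeric = []
for k in range(len(logits)):
    bumped = list(logits)
    bumped[k] += eps
    numeric.append((cross_entropy(bumped, target) - cross_entropy(logits, target)) / eps)

# One gradient descent step directly on the logits.
updated = [z - lr * g for z, g in zip(logits, grad)]
loss_before = cross_entropy(logits, target)
loss_after = cross_entropy(updated, target)      # lower than loss_before
```

In a real network the same gradient is propagated further back through each layer's weights by automatic differentiation; the step on the logits here only isolates the last link of that chain.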

Technical Highlights: CPU-only training (lowers the entry barrier), complete visualizations (architecture diagram, training curves, confusion matrix), detailed experiment records, and a modular code structure.

Section 06

Learning Value and Conclusion

Learning Value: Beginners can see how a neural network works, how its metrics change during training, and what effect regularization has; experienced developers can use it as a model for building complete, reproducible experiments (documentation, code, performance analysis).

Conclusion: Building a neural network from scratch may seem 'outdated', but it gives an in-depth understanding of the basic principles. The 98.16% accuracy shows that a classic feedforward network, paired with sound training techniques, still performs very well on simple tasks.