
Complete Practical Project for MNIST Handwritten Digit Recognition Using CNN

This article introduces a Convolutional Neural Network (CNN) project built with TensorFlow/Keras for MNIST handwritten digit classification, including a complete model training and evaluation process as well as deployment of an interactive Streamlit web application.

Tags: CNN, MNIST, TensorFlow, Keras, Deep Learning, Image Classification, Streamlit, Neural Networks

Published 2026-06-03 22:15 | Recent activity 2026-06-03 22:19 | Estimated read 8 min

Section 01

Introduction: Complete Practical Project for MNIST Handwritten Digit Recognition Using CNN

This project was developed and open-sourced on GitHub by RabiyaMalik242 (project link: https://github.com/RabiyaMalik242/MNIST-CNN-Project). It builds a Convolutional Neural Network (CNN) with TensorFlow/Keras to classify MNIST handwritten digits, and covers the full workflow from model training and evaluation to deployment as an interactive Streamlit web application. With a clear structure and detailed annotations, it is a good practice case for deep learning beginners.

Section 02

Dataset and Task Background

The MNIST dataset is a classic benchmark in machine learning. It contains 60,000 training samples and 10,000 test samples, each a 28×28-pixel, single-channel grayscale image. The digits come from real handwriting and have been size-normalized and centered, appearing as white strokes on a black background. Because it is moderate in size, roughly class-balanced, and reasonably difficult, it has become a standard dataset for validating models. This project performs multi-class classification over the 10 digit classes (0-9).

Section 03

CNN Architecture Design and Training Configuration

Network Architecture

Input → Conv2D (32 3×3 kernels, ReLU) → MaxPooling → Conv2D (64 3×3 kernels, ReLU) → MaxPooling → Flatten → Dense (128 neurons, ReLU) → Dropout (0.3) → Output (10 neurons, Softmax)
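To make the layer chain concrete, the following sketch walks a 28×28×1 input through it and counts trainable parameters by hand. The source does not state padding, stride, or pooling window, so the Keras defaults (no padding, stride 1) and a 2×2 pooling window are assumed here; the function names are illustrative only.

```python
def conv_out(size, kernel=3):
    """Spatial size after a stride-1 convolution with no padding ('valid')."""
    return size - kernel + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping max pooling."""
    return size // window


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units


def summarize():
    """Return [(layer, output_shape, trainable_params), ...] and the total."""
    rows = []
    s = conv_out(28)
    rows.append(("Conv2D(32)", (s, s, 32), conv_params(1, 32)))
    s = pool_out(s)
    rows.append(("MaxPooling", (s, s, 32), 0))
    s = conv_out(s)
    rows.append(("Conv2D(64)", (s, s, 64), conv_params(32, 64)))
    s = pool_out(s)
    rows.append(("MaxPooling", (s, s, 64), 0))
    flat = s * s * 64
    rows.append(("Flatten", (flat,), 0))
    rows.append(("Dense(128)", (128,), dense_params(flat, 128)))
    rows.append(("Dropout(0.3)", (128,), 0))
    rows.append(("Output(10)", (10,), dense_params(128, 10)))
    return rows, sum(p for _, _, p in rows)


rows, total = summarize()
for name, shape, params in rows:
    print(f"{name:<14}{str(shape):<14}{params:>8}")
print("total trainable parameters:", total)
```

Under these assumptions the Dense (128) layer holds most of the weights (204,928 of 225,034), because it connects to all 1,600 flattened features.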

Key Choices

Activation functions: ReLU for hidden layers (mitigates vanishing gradients), Softmax for the output layer (yields a probability distribution over the 10 classes)
Optimizer: Adam (adaptive learning rate)
Loss function: Categorical cross-entropy
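The architecture and the choices above might be expressed in Keras roughly as follows. This is a sketch, not the repository's actual code: the `build_model` name, the 2×2 pooling window, the default 'valid' padding, and the `accuracy` metric are assumptions that the article does not confirm.

```python
from tensorflow.keras import layers, models


def build_model(dropout_rate=0.3):
    """Build and compile a small CNN matching the layer chain described above."""
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),               # grayscale 28x28 input
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),                   # assumed 2x2 window
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(dropout_rate),
        layers.Dense(10, activation="softmax"),        # one probability per digit
    ])
    # categorical_crossentropy expects one-hot encoded labels
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_model()
```

Calling `model.summary()` prints the per-layer output shapes and parameter counts, which is a quick way to check that the network matches the intended design.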

Training Configuration

Number of training epochs: 15
Batch size: 32
Data preprocessing: Scale pixel values from 0-255 to the 0-1 range
Validation strategy: Hold out a validation split of the training data to monitor training
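The preprocessing and validation split can be sketched with NumPy alone. The article does not give the validation fraction, so the 0.25 used below is purely illustrative, as are the function names and the random stand-in data. Taking the validation samples from the end of the array follows what Keras documents for the `validation_split` argument of `fit`.

```python
import numpy as np


def preprocess(images, labels, num_classes=10):
    """Scale uint8 pixels to [0, 1], add a channel axis, one-hot encode labels."""
    x = images.astype("float32") / 255.0
    x = x[..., np.newaxis]                      # (N, 28, 28) -> (N, 28, 28, 1)
    y = np.eye(num_classes, dtype="float32")[labels]
    return x, y


def split_validation(x, y, validation_split):
    """Hold out the last fraction of the samples for validation."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])


# Random stand-in data with the same layout as MNIST (not real digits).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 28, 28), dtype=np.uint8)
labels = rng.integers(0, 10, size=20)

x, y = preprocess(images, labels)
(x_train, y_train), (x_val, y_val) = split_validation(x, y, validation_split=0.25)
print(x_train.shape, x_val.shape)
```

With real data, the same arrays would be passed to training with the stated 15 epochs and batch size of 32.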

Section 04

Model Evaluation and Performance

Evaluation Metrics

The evaluation covers accuracy, precision, recall, and F1 score, along with a confusion matrix and a classification report.
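These metrics all derive from the confusion matrix. The sketch below computes them from scratch with NumPy on a tiny three-class toy example; it illustrates the definitions and is not necessarily how the repository computes them. The function names and sample labels are made up for illustration.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def metrics_from_cm(cm):
    """Return overall accuracy and per-class precision, recall, and F1."""
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)   # how often each class was predicted
    actual = cm.sum(axis=1)      # how often each class truly occurred
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision, recall, f1


# Toy labels: 6 samples, 3 classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy, precision, recall, f1 = metrics_from_cm(cm)
print(cm)
print("accuracy:", accuracy)
print("precision:", precision, "recall:", recall, "f1:", f1)
```

For MNIST the same functions apply with `num_classes=10`, using the argmax of the model's softmax output as `y_pred`.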

Visualization Tools

The project provides visualizations of sample images, class distribution, before/after preprocessing comparison, training curves, a confusion matrix heatmap, and prediction examples.

Performance Results

After 15 epochs of training, the model achieves an accuracy of approximately 99% on the test set. The loss is low, and validation performance stays close to training performance, which suggests that the model converges well and that overfitting is under control.

Section 05

Streamlit Interactive Web Application

One of the project's highlights is the integration of a Streamlit web application, which includes the following features:

Canvas drawing: Users can directly handwrite digits on the web canvas
Image upload: Supports uploading local digit images for recognition
Real-time prediction: Displays results immediately after submission
Probability distribution: Shows the model's prediction confidence for the 10 digits

The application converts each input image to the 28×28 grayscale format the model expects, feeds it to the trained CNN, and returns the prediction. This makes it well suited to teaching demonstrations and project reports.
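A minimal sketch of that preprocessing step, using NumPy only, is shown below. It assumes the input is already a 2-D grayscale array with dark strokes on a light background and with side lengths that are multiples of 28 (for example a 280×280 canvas). The `to_mnist_input` name is hypothetical, and the actual app may well use an image library to resize arbitrary inputs.

```python
import numpy as np


def to_mnist_input(gray, invert=True):
    """Convert a 2-D uint8 grayscale array into a (1, 28, 28, 1) float batch."""
    gray = np.asarray(gray, dtype="float32")
    h, w = gray.shape
    if h % 28 or w % 28:
        raise ValueError("this sketch only handles sizes that are multiples of 28")
    if invert:
        # MNIST digits are white on black, so flip dark-on-light drawings.
        gray = 255.0 - gray
    # Downscale by averaging each (h/28 x w/28) block of pixels.
    small = gray.reshape(28, h // 28, 28, w // 28).mean(axis=(1, 3))
    return (small / 255.0).reshape(1, 28, 28, 1)


# A white 280x280 canvas with one black vertical stroke.
canvas = np.full((280, 280), 255, dtype=np.uint8)
canvas[100:180, 130:150] = 0

batch = to_mnist_input(canvas)
print(batch.shape, batch.min(), batch.max())
```

The resulting batch has the shape a Keras model with a 28×28×1 input expects, so it could be passed directly to `model.predict`.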

Section 06

Analysis of CNN Feature Learning Mechanism

A CNN recognizes digits through hierarchical feature learning:

Shallow features: Early convolutional layers learn edges (horizontal, vertical, diagonal) and simple textures
Middle features: Middle convolutional layers combine basic features to form digit outlines (e.g., the circle of 0, the vertical line of 1)
Deep features: Fully connected layers integrate the spatial features into an overall decision; the Dropout layer reduces overfitting and improves generalization
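The idea of an edge-detecting convolutional filter can be illustrated with fixed, hand-written kernels. In the sketch below, Sobel kernels stand in for the filters a CNN would learn during training (an assumption made for illustration; learned filters are not guaranteed to look like this), and a synthetic vertical stroke stands in for the digit 1.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Slide the kernel over the image without padding, as Conv2D does."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype="float32")
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    return np.maximum(x, 0.0)


# Sobel kernels: SOBEL_X responds to vertical edges, SOBEL_Y to horizontal ones.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype="float32")
SOBEL_Y = SOBEL_X.T

# Synthetic 28x28 image containing a vertical stroke, similar to the digit 1.
image = np.zeros((28, 28), dtype="float32")
image[4:24, 13:15] = 1.0

vertical = relu(correlate2d_valid(image, SOBEL_X))
horizontal = relu(correlate2d_valid(image, SOBEL_Y))
print("vertical-edge response, middle row:", vertical[10].max())
print("horizontal-edge response, middle row:", horizontal[10].max())
```

Along the middle of the stroke only the vertical-edge filter fires; the horizontal-edge filter responds only near the stroke's ends. Later layers can combine such responses into shapes.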

Section 07

Project Usage Guide and Optimization Directions

Usage Guide

Environment configuration: Clone the repository → Install dependencies (TensorFlow, NumPy, Streamlit, etc.)
Model training: Run model.ipynb in Jupyter Notebook
Launch the application: Execute streamlit run app.py and visit http://localhost:8501

Optimization Directions

Data augmentation: Random rotation, translation, scaling, etc.
Hyperparameter tuning: Learning rate, number of convolutional kernels, Dropout ratio, etc.
Architecture improvement: Add Batch Normalization, residual connections
Cloud deployment: AWS, HuggingFace Spaces, etc.
Canvas optimization: Stroke smoothing, center alignment, etc.
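As one example of the data augmentation direction, the sketch below applies a random translation to a digit image with NumPy. The two-pixel maximum shift and the function names are illustrative assumptions; the article does not say how augmentation would be implemented.

```python
import numpy as np


def shift(image, dy, dx):
    """Translate a 2-D image by (dy, dx) pixels, filling vacated pixels with 0."""
    h, w = image.shape
    shifted = np.zeros_like(image)
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted


def random_shift(image, max_shift=2, rng=None):
    """Translate by a random offset of at most max_shift pixels per axis."""
    if rng is None:
        rng = np.random.default_rng()
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    return shift(image, dy, dx)


# Synthetic image: a 4x4 white square in the middle of a black 28x28 canvas.
image = np.zeros((28, 28), dtype="float32")
image[12:16, 12:16] = 1.0

moved = shift(image, 1, -2)                               # down 1, left 2
augmented = random_shift(image, rng=np.random.default_rng(0))
print(moved[13:17, 10:14].sum(), augmented.sum())
```

Small shifts like this keep the digit fully inside the frame, so the label stays valid while the model sees more positional variety.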

Section 08

Summary and Insights

This project demonstrates the full machine learning engineering workflow: data preparation, model construction, training and optimization, and application deployment. For beginners, reproducing the project helps build an understanding of CNN principles, TensorFlow/Keras usage, model evaluation methods, and end-to-end development. MNIST is a simple introductory dataset, but working with it lays the foundation for more complex computer vision tasks. Beginners are advised to reproduce the project first and then extend and optimize it to deepen their understanding of deep learning.