
Deconstructing the Neural Network Black Box: An Interpretability Exploration of a Sudoku Solver

Through a custom Sudoku solver project, the developer attempts to transform neural network weights from mysterious unknowns into observable and understandable values, achieving full transparency in the model's decision-making process.

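Neural networks, interpretable AI, convolutional neural networks, Sudoku solver, machine learning, weight auditing, feature visualization, deep learning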

Published 2026-06-16 07:45 | Recent activity 2026-06-16 07:48 | Estimated read 7 min

Section 01

[Introduction] Deconstructing the Neural Network Black Box: An Interpretability Exploration of a Sudoku Solver

This project was developed by Pat Snyder (GitHub project: sudoku-ai). Its core goal is to challenge the view of neural networks as a 'black box': by building a custom Sudoku solver, it turns model weights into observable, understandable values and aims to make the decision-making process fully transparent. The project uses a hybrid convolutional architecture combined with interpretability mechanisms such as weight auditing and feature visualization to explore the network's internal logic. Its feedback-driven iterative optimization offers a practical engineering reference for interpretable AI.

Section 02

Project Background and Motivation: Breaking the Black Box Perception of Neural Networks

Neural networks have long been regarded as 'black boxes': inputs and outputs are clear, but the intermediate processing is hard to explain. This project tries to break that barrier. Its core philosophy is that neural network weights should be as transparent as the bitboards of a chess engine. By translating raw data into visual metaphors, it traces the complete chain from 'ideas' to 'decisions'.

Section 03

Architecture Design: Hybrid Convolution and Multi-Perspective Analysis

The Lens Stack

The first stage is an Inception-style hybrid convolutional layer in which six filter shapes run in parallel (2×2, 3×3, 4×4, 2×6, 1×5, 5×1). Each shape has 16 variants, for a total of 6 × 16 = 96 feature channels. Looking at the Sudoku board through several geometries at once is meant to avoid biasing the network toward any single rule.
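The parallel-branch idea can be sketched in PyTorch as follows. This is a minimal illustration, not the project's actual code: the class name `LensStack`, the single-channel board encoding, and the use of `padding="same"` to keep every branch at 9×9 are all assumptions.

```python
import torch
import torch.nn as nn

# Kernel shapes from the text, as (height, width).
KERNEL_SHAPES = [(2, 2), (3, 3), (4, 4), (2, 6), (1, 5), (5, 1)]


class LensStack(nn.Module):
    """Hypothetical name: six parallel conv branches concatenated on channels."""

    def __init__(self, in_channels: int = 1, variants: int = 16):
        super().__init__()
        # padding="same" keeps each branch output at 9x9 so the branches can be
        # concatenated (an assumption; the project may pad differently).
        self.branches = nn.ModuleList(
            [
                nn.Conv2d(in_channels, variants, kernel_size=k, padding="same")
                for k in KERNEL_SHAPES
            ]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate along the channel axis: 6 branches x 16 variants = 96.
        return torch.cat([branch(x) for branch in self.branches], dim=1)


board = torch.zeros(1, 1, 9, 9)  # one board, one input channel (assumed encoding)
features = LensStack()(board)
print(features.shape)
```

Because every branch preserves the 9×9 spatial size, each of the 96 output channels can later be viewed as its own 9×9 map of the board.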

The Synthesizer

The synthesizer compresses the 96 raw feature channels into 32 combined channels. This mirrors the human cognitive path from details to the whole and forces the network to combine micro-patterns into higher-level structural concepts.
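One common way to mix channels like this is a 1×1 convolution. The source only gives the channel counts (96 in, 32 out), so the 1×1 kernel and the ReLU in this sketch are assumptions.

```python
import torch
import torch.nn as nn

# Assumption: a 1x1 convolution performs the 96 -> 32 mixing.
synthesizer = nn.Sequential(
    nn.Conv2d(96, 32, kernel_size=1),
    nn.ReLU(),
)

lens_output = torch.randn(4, 96, 9, 9)  # stand-in for a batch of 96-channel features
combined = synthesizer(lens_output)
print(combined.shape)
```

A 1×1 kernel leaves the 9×9 layout untouched and only learns how to weight and combine the 96 input channels at each cell.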

Section 04

Interpretability Mechanisms: Three Methods to Make Weights Speak

Weight Auditing

After training, extract the floating-point weight matrices and isolate both 'dead channels' (weights close to zero) and high-weight channels. Like a financial audit, this makes each channel's contribution transparent.
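A weight audit of this kind could look like the sketch below. The function name `audit_channels`, the mean-absolute-weight metric, and the `1e-3` threshold are illustrative assumptions; the project may score channels differently.

```python
import torch
import torch.nn as nn


def audit_channels(conv: nn.Conv2d, dead_threshold: float = 1e-3):
    """Return per-output-channel mean |weight|, dead indices, and a ranking."""
    with torch.no_grad():
        # conv.weight has shape (out_channels, in_channels, kH, kW).
        magnitude = conv.weight.abs().mean(dim=(1, 2, 3))
    dead = torch.nonzero(magnitude < dead_threshold).flatten().tolist()
    ranked = torch.argsort(magnitude, descending=True).tolist()
    return magnitude, dead, ranked


conv = nn.Conv2d(1, 16, kernel_size=3)
with torch.no_grad():
    conv.weight[5].zero_()     # simulate a dead channel for the demo
    conv.weight[2].fill_(1.0)  # and a dominant one

magnitude, dead, ranked = audit_channels(conv)
print("dead channels:", dead)
print("strongest channel:", ranked[0])
```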

Feature Map Snapshots

Export 9×9 grayscale snapshots to show what the first and second layers 'see' and 'prioritize' in decision-making, transforming abstract matrices into intuitive visual patterns.
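As a rough sketch of how such snapshots might be produced, the code below captures a layer's output with a forward hook, min-max scales one channel to 0..255, and writes it as a plain-text PGM image. The helper names, the PGM format, and the stand-in layer are assumptions; the project's export format is not stated.

```python
import os
import tempfile

import torch
import torch.nn as nn

captured = {}


def save_activation(name):
    # Forward hooks receive (module, inputs, output).
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook


def to_grayscale(feature_map: torch.Tensor) -> torch.Tensor:
    """Min-max scale one 2-D map to uint8 values in 0..255."""
    lo, hi = feature_map.min(), feature_map.max()
    scaled = (feature_map - lo) / (hi - lo).clamp_min(1e-8)
    return (scaled * 255).round().to(torch.uint8)


def write_pgm(path: str, image: torch.Tensor) -> None:
    """Write a 2-D uint8 tensor as a plain-text PGM (P2) file."""
    height, width = image.shape
    rows = [" ".join(str(v) for v in row) for row in image.tolist()]
    with open(path, "w") as f:
        f.write(f"P2\n{width} {height}\n255\n" + "\n".join(rows) + "\n")


layer = nn.Conv2d(1, 4, kernel_size=3, padding=1)  # stand-in for the first layer
layer.register_forward_hook(save_activation("layer1"))
layer(torch.rand(1, 1, 9, 9))

snapshot = to_grayscale(captured["layer1"][0, 0])  # sample 0, channel 0
out_path = os.path.join(tempfile.gettempdir(), "layer1_ch0.pgm")
write_pgm(out_path, snapshot)
```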

Metaphor Layer

Translate mathematical matrices into strategic narratives by assigning semantic labels to weights (e.g., 'missing numbers in a row', 'grid constraints'), building a bridge between raw bytes and human understanding.
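In its simplest form, a labeling layer like this is a lookup table from channel index to a human-readable phrase. The table contents, the channel indices, and the `narrate` helper below are hypothetical; only the two example labels come from the text.

```python
# Hypothetical label table: channel index -> human-readable meaning.
# In practice the labels would be assigned by hand after inspecting a channel.
CHANNEL_LABELS = {
    0: "missing numbers in a row",
    1: "grid constraints",
}


def narrate(channel_scores: dict, top_k: int = 2) -> list:
    """Describe the top-k scoring channels in plain language."""
    ranked = sorted(channel_scores.items(), key=lambda kv: kv[1], reverse=True)
    lines = []
    for channel, score in ranked[:top_k]:
        label = CHANNEL_LABELS.get(channel, "unlabeled pattern")
        lines.append(f"channel {channel} ({label}): score {score:.2f}")
    return lines


story = narrate({0: 0.82, 1: 0.35, 7: 0.91})
for line in story:
    print(line)
```

Channels without an entry fall back to "unlabeled pattern", which makes gaps in the labeling visible instead of hiding them.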

Section 05

Iterative Optimization: Feedback-Driven Architecture and Strategy Adjustments

Based on analytical insights, a feedback-driven loop is used for adjustments:

Regularization and Robustness

Introduce nn.Dropout2d to encourage the model to learn robust and generalizable features, avoiding overfitting.
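`nn.Dropout2d` drops entire feature channels during training, not individual cells, and is a no-op in evaluation mode. The short check below illustrates that behavior; the drop probability `p=0.25` and the 32-channel input are assumed values, not taken from the project.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout2d(p=0.25)  # p is an assumed value
x = torch.ones(1, 32, 9, 9)

drop.train()
y = drop(x)
# In training mode each channel is either all zeros or scaled by 1 / (1 - p),
# so every value within a channel is identical.
per_channel = y.flatten(2)  # shape (1, 32, 81)
channel_is_uniform = (per_channel == per_channel[..., :1]).all(dim=-1)
print(channel_is_uniform.all().item())

drop.eval()
same_in_eval = torch.equal(drop(x), x)  # dropout is disabled at inference
print(same_in_eval)
```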

Learning Rate Scheduling

Implement the ReduceLROnPlateau scheduler. When validation loss stops improving and starts to fluctuate in later epochs, the learning rate is reduced from 0.001 to 0.0001 to stabilize the optimizer's steps.
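The scheduler behavior can be reproduced with a toy loop. Only the scheduler class and the two learning rates come from the text; the `patience` value, the Adam optimizer, the stand-in model, and the validation losses below are made up for illustration.

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import ReduceLROnPlateau

model = nn.Linear(4, 1)  # stand-in for the solver network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=2, min_lr=1e-4
)

# Made-up validation losses: steady improvement, then a noisy plateau.
val_losses = [0.90, 0.60, 0.45, 0.46, 0.45, 0.47, 0.46]
lr_history = []
for val_loss in val_losses:
    optimizer.step()          # a real loop would call loss.backward() first
    scheduler.step(val_loss)  # the scheduler watches the validation metric
    lr_history.append(optimizer.param_groups[0]["lr"])

print(lr_history)
```

With `factor=0.1`, the first reduction takes the rate from 0.001 to 0.0001, and `min_lr=1e-4` prevents any further drop.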

Data Diversity

Increase the number of empty cells: enhance spatial learning
Introduce multi-solution puzzles: encourage temporal reasoning
Eliminate generation bias: remove pattern bias in data generation
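A generator along these lines might vary the number of blanks per puzzle and randomize digit labels. The sketch below is an assumption-heavy illustration: the blank-count range, the digit relabeling as a bias-reduction step, and all function names are hypothetical. It does not check solution uniqueness, so some puzzles may have several solutions.

```python
import random


def solved_grid():
    """A valid solved board built from the standard shifted-row pattern."""
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]


def relabel_digits(grid, rng):
    """Randomly permute the digits 1..9 so no digit is tied to fixed positions."""
    digits = list(range(1, 10))
    rng.shuffle(digits)
    return [[digits[v - 1] for v in row] for row in grid]


def make_puzzle(rng, min_empty=30, max_empty=55):
    """Blank a random number of cells; 0 marks an empty cell."""
    solution = relabel_digits(solved_grid(), rng)
    n_empty = rng.randint(min_empty, max_empty)
    puzzle = [row[:] for row in solution]
    for idx in rng.sample(range(81), n_empty):
        puzzle[idx // 9][idx % 9] = 0
    return puzzle, solution


rng = random.Random(0)
puzzle, solution = make_puzzle(rng)
n_blank = sum(v == 0 for row in puzzle for v in row)
print(n_blank, "empty cells")
```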

Section 06

Technical Highlights and Future Exploration Directions

Technical Highlights

Training observations: The model starts out weak but learns quickly; validation loss improves at epoch 8, with slight overfitting; later fluctuations are attributed to a learning rate that was too high.
Checkpoint mechanism: Save the state after every epoch so training can resume from an earlier model, which helps with long-running iteration.
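A per-epoch checkpoint routine could be sketched as follows. The dictionary keys, file name, stand-in model, and helper names are assumptions; the project's actual checkpoint layout is not described.

```python
import os
import tempfile

import torch
import torch.nn as nn


def save_checkpoint(path, model, optimizer, epoch):
    """Store everything needed to resume: epoch, weights, optimizer state."""
    torch.save(
        {
            "epoch": epoch,
            "model_state": model.state_dict(),
            "optimizer_state": optimizer.state_dict(),
        },
        path,
    )


def load_checkpoint(path, model, optimizer):
    """Restore model and optimizer in place; return the epoch to resume from."""
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["model_state"])
    optimizer.load_state_dict(checkpoint["optimizer_state"])
    return checkpoint["epoch"] + 1


model = nn.Conv2d(1, 16, kernel_size=3, padding=1)  # stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
path = os.path.join(tempfile.mkdtemp(), "epoch_008.pt")
save_checkpoint(path, model, optimizer, epoch=8)

restored = nn.Conv2d(1, 16, kernel_size=3, padding=1)
restored_opt = torch.optim.Adam(restored.parameters(), lr=1e-3)
start_epoch = load_checkpoint(path, restored, restored_opt)
print(start_epoch)
```

Saving the optimizer state alongside the weights matters for optimizers like Adam, whose running statistics would otherwise restart from scratch on resume.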

Future Directions

Channel optimization: Identify the most influential channels to decide pruning or expansion
Feature selection: Pre-analyze the dataset, test new features and retire old ones
Difficulty generation: Eliminate generation bias and create more challenging puzzles

Section 07

Insights: Engineering Practice of Interpretable AI from the Sudoku Solver Perspective

The value of this project lies not only in solving Sudoku but also in offering a different way to think about neural networks. It challenges the assumption that 'black boxes are inevitable' and shows that, with carefully designed architecture and systematic analysis, we can glimpse a network's internal operations. For the field of interpretable AI, this bottom-up engineering practice is more instructive than abstract theory, turning 'interpretability' from a slogan into actionable practice.