Zing Forum


Empirical Study on Neural Network Depth and Representation Capability: Controlled Experiments with 1800 Models Reveal Parameters Are the Key

Through controlled experiments with 1800 Fashion-MNIST models, this study investigates the impact of depth and parameters on the representation capability of neural networks, finding that the number of parameters rather than depth is the core factor determining model performance.

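Tags: neural networks, deep learning, representation capability, model depth, parameter scale, Fashion-MNIST, controlled experiments, statistical validation, machine learning research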
Published 2026-06-15 16:42 | Recent activity 2026-06-15 16:50 | Estimated read 7 min
1

Section 01

Introduction to the Empirical Study on Neural Network Depth and Representation Capability

This study trains 1800 Fashion-MNIST models under controlled conditions. Its core finding is that the number of parameters, rather than depth, is the key factor determining the representation capability of a neural network. The study presents itself as the first to separate the depth effect from the parameter-scale effect, and offers its results as a reference for deep learning model design.

2

Section 02

Research Background and Core Questions

A long-standing question in deep learning is whether increasing network depth enhances representation capability. Previous studies often confounded depth with parameter count, because adding layers usually adds parameters as well, which biased their conclusions. This study hypothesizes that depth by itself does not increase representation capability and that parameter count is the key factor, then tests this hypothesis with large-scale controlled experiments.

3

Section 03

Experimental Design and Methodology

A dual-mechanism controlled design is used to isolate variables:

  • Equal-parameter mechanism: fix the total number of parameters and vary only the depth (2/4/6/8/12/16 layers) to test the pure depth effect;
  • Fixed-width mechanism: fix the number of neurons per layer, so parameters grow naturally as depth increases, simulating how networks are scaled in practice.

Experimental scale: 1800 models in total, 900 runs for each of the equal-parameter and fixed-width mechanisms, with 10 seeds per configuration and data corruption levels of 0.0/0.6/1.0. Training uses early stopping, with a median of about 35 epochs. The data is reported to be complete and reliable.
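
The two mechanisms can be made concrete with a little parameter arithmetic. The sketch below assumes a plain fully connected network on flattened 28x28 Fashion-MNIST inputs and reads "layers" as hidden layers; the article does not state the architecture, and the 100,000-parameter budget and the width of 64 are hypothetical values chosen only for illustration.

```python
# Sketch only: the MLP layout, BUDGET and FIXED_WIDTH are illustrative
# assumptions, not values reported by the study.
N_INPUTS = 28 * 28              # Fashion-MNIST images are 28x28 grayscale
N_CLASSES = 10                  # Fashion-MNIST has 10 classes
DEPTHS = [2, 4, 6, 8, 12, 16]   # depths listed in the article


def param_count(depth: int, width: int) -> int:
    """Weights plus biases of an MLP with `depth` hidden layers of `width` units."""
    first = N_INPUTS * width + width
    hidden = (depth - 1) * (width * width + width)
    output = width * N_CLASSES + N_CLASSES
    return first + hidden + output


def width_for_budget(depth: int, budget: int) -> int:
    """Largest hidden width whose parameter count stays within `budget`."""
    if param_count(depth, 1) > budget:
        raise ValueError("budget too small for this depth")
    width = 1
    while param_count(depth, width + 1) <= budget:
        width += 1
    return width


BUDGET = 100_000    # hypothetical equal-parameter budget
FIXED_WIDTH = 64    # hypothetical fixed width

# Equal-parameter mechanism: the width shrinks as depth grows.
equal_param = {d: width_for_budget(d, BUDGET) for d in DEPTHS}
# Fixed-width mechanism: the parameter count grows as depth grows.
fixed_width = {d: param_count(d, FIXED_WIDTH) for d in DEPTHS}

for d in DEPTHS:
    print(f"depth={d:2d}  equal-param width={equal_param[d]:3d}  "
          f"fixed-width params={fixed_width[d]}")
```

Under these assumptions the equal-parameter networks get narrower with every added layer, while the fixed-width networks gain parameters with every added layer, which is exactly the confound the design is meant to separate.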

4

Section 04

Core Findings and Statistical Verification

  • Equal-parameter mechanism: with the parameter count fixed, accuracy stays at 67-70% across depths (a flat trend). The Spearman correlation between depth and accuracy is r=-0.08 (not significant) and the Kruskal-Wallis test gives p=0.118, so depth by itself has no significant effect.
  • Fixed-width mechanism: when parameters grow with depth, accuracy improves by 5.7% and the Spearman correlation is r=+0.64 (highly significant), so the apparent depth effect comes from the parameter growth that accompanies it.
  • Key insight: depth is a container for parameters rather than a source of capability. Spreading a fixed parameter budget across too many layers (as in the equal-parameter mechanism) reduces the representation capability of each layer, and the depth_per_param indicator reveals this bottleneck.
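
For readers who want to run the same two tests, here is a minimal sketch using SciPy's `spearmanr` and `kruskal`. The accuracies are synthetic and generated only for illustration; they are not the study's data, and the noise level and slope are assumptions.

```python
import numpy as np
from scipy import stats

# Synthetic accuracies for illustration only; these are NOT the study's data.
rng = np.random.default_rng(0)
DEPTHS = [2, 4, 6, 8, 12, 16]
SEEDS = 10

depth = np.repeat(DEPTHS, SEEDS)
# Equal-parameter style: accuracy does not depend on depth.
acc_equal = 0.68 + rng.normal(0.0, 0.01, depth.size)
# Fixed-width style: accuracy rises with depth (and thus with parameters).
acc_fixed = 0.64 + 0.004 * depth + rng.normal(0.0, 0.01, depth.size)


def depth_tests(depth: np.ndarray, accuracy: np.ndarray) -> dict:
    """Spearman rank correlation and Kruskal-Wallis test of accuracy vs. depth."""
    rho, p_rho = stats.spearmanr(depth, accuracy)
    groups = [accuracy[depth == d] for d in np.unique(depth)]
    h_stat, p_kw = stats.kruskal(*groups)
    return {"rho": float(rho), "p_rho": float(p_rho),
            "H": float(h_stat), "p_kw": float(p_kw)}


res_equal = depth_tests(depth, acc_equal)
res_fixed = depth_tests(depth, acc_fixed)
print("equal-parameter:", res_equal)
print("fixed-width:    ", res_fixed)
```

Spearman asks whether accuracy moves monotonically with depth, while Kruskal-Wallis asks whether any depth group differs from the others, so the two tests complement each other.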

5

Section 05

Statistical Methods and Model Interpretation

Because the data does not satisfy the normality and homogeneity-of-variance assumptions, non-parametric statistics are used:

  • Kruskal-Wallis test for differences between groups;
  • Dunn post-hoc test with Holm correction for pairwise comparisons;
  • rank-biserial correlation as the effect size.

Model comparison: OLS (negative depth coefficient), Lasso (eliminates the depth variable), decision tree (depth importance ≈0), and random forest with SHAP (parameters dominate). All four consistently indicate that parameter count is the dominant predictor of performance.
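
Two of the pieces named above are simple enough to sketch directly: the Holm step-down correction and the rank-biserial effect size. The helper names `holm_adjust` and `rank_biserial` are hypothetical, the rank-biserial value is computed from the Mann-Whitney U statistic (one common formulation), and Dunn's test itself is not implemented here.

```python
import numpy as np
from scipy.stats import rankdata


def rank_biserial(x, y) -> float:
    """Rank-biserial correlation via Mann-Whitney U.

    Returns +1 when every value in x exceeds every value in y, and -1 in
    the opposite case. Ties receive average ranks.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    ranks = rankdata(np.concatenate([x, y]))
    u1 = ranks[:n1].sum() - n1 * (n1 + 1) / 2   # U statistic for sample x
    return 2.0 * u1 / (n1 * n2) - 1.0


def holm_adjust(p_values) -> np.ndarray:
    """Holm step-down adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * (m - np.arange(m))       # multiply by m, m-1, ..., 1
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted            # undo the sort
    return adjusted


print(rank_biserial([4, 5, 6], [1, 2, 3]))
print(holm_adjust([0.01, 0.04, 0.03]))
```

The Holm adjustment works on any set of pairwise p-values, so it can be applied to the output of whichever post-hoc test is used.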

6

Section 06

Implications for Practical Applications

Guidance for deep learning practice:

  1. Prefer shallow models with sufficient parameters over deep networks with too few parameters;
  2. When interpreting depth scaling laws, focus on parameter efficiency rather than on depth alone;
  3. Use depth_per_param as a diagnostic indicator during model design to avoid spreading parameters too thinly across layers;
  4. A well-tuned small model may outperform a much deeper network whose layers are starved of parameters.
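
The article names depth_per_param but does not define it. One plausible reading, used here purely as an assumption, is the number of layers divided by the total parameter count, scaled to layers per million parameters for readability. The 100,000-parameter budget is hypothetical.

```python
import math


def depth_per_param(depth: int, n_params: int) -> float:
    """Layers per million parameters (an ASSUMED definition, not the study's)."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return depth / n_params * 1e6


BUDGET = 100_000  # hypothetical fixed parameter budget
for depth in (2, 4, 6, 8, 12, 16):
    ratio = depth_per_param(depth, BUDGET)
    print(f"depth={depth:2d}  layers per million params={ratio:.1f}")
```

Under a fixed budget this ratio grows linearly with depth, so a high value flags a design where each layer has few parameters to work with.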

7

Section 07

Technical Implementation and Open-Source Resources

Technology stack: Python ecosystem (NumPy and Pandas for data processing, Scikit-learn for modeling, SciPy for statistics, Matplotlib and Seaborn for visualization, SHAP for interpretability).

Open-source resources: preprocessed datasets, mechanism-specific experimental data, trained models, statistical test reports, and feature importance analysis.

8

Section 08

Research Conclusion

Using large-scale controlled experiments with 1800 models, this study challenges the "deeper is better" intuition with statistical evidence and emphasizes parameter scale and how efficiently parameters are distributed across layers. The takeaway for researchers and engineers: instead of adding layers by default, make sure each layer has enough parameters to realize its representation potential.