Section 01
Introduction: A Study on the Joint Evaluation of Activation Functions and Initialization Strategies
This article systematically analyzes the gradient flow dynamics, saturation phenomena, and optimization behavior of four activation functions (ReLU, tanh, arctan, and softsign) under different Xavier initialization scales, using an MLP implemented purely in NumPy. The findings underscore that the choice of activation function should be evaluated jointly with the initialization strategy, rather than judged by the final accuracy metric alone.
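As a concrete reference for the setup described above, the sketch below defines the four activation functions and a scaled Xavier (Glorot) uniform initializer in NumPy. The function names and the `scale` parameter are illustrative assumptions, not the article's exact implementation; `scale=1.0` corresponds to the textbook Glorot limit, and other values probe sensitivity to the initialization magnitude.

```python
import numpy as np

# The four activation functions compared in this study.
def relu(x):
    return np.maximum(0.0, x)

def tanh(x):
    return np.tanh(x)

def arctan(x):
    return np.arctan(x)

def softsign(x):
    return x / (1.0 + np.abs(x))

# Xavier (Glorot) uniform initialization with a tunable scale
# factor (hypothetical parameter for this sketch): weights are
# drawn from U(-limit, limit) with limit = scale * sqrt(6 / (fan_in + fan_out)).
def xavier_uniform(fan_in, fan_out, scale=1.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    limit = scale * np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))
```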