Zing Forum

Building an Employee Salary Prediction System Using Artificial Neural Networks and Optuna Automatic Hyperparameter Tuning

This article introduces a deep learning-based salary prediction project that implements a complete machine learning workflow from data cleaning to model deployment using Artificial Neural Networks (ANN) combined with Optuna automatic hyperparameter optimization.

Tags: deep learning · artificial neural networks · salary prediction · Optuna hyperparameter optimization · regression analysis · TensorFlow · machine learning engineering
Published 2026-05-12 19:55 · Recent activity 2026-05-12 19:59 · Estimated read: 8 min

Section 01

Introduction: Core Analysis of the Employee Salary Prediction System Based on ANN and Optuna

This article introduces a deep learning-based salary prediction project that implements a complete machine learning workflow, from data cleaning to model deployment, using Artificial Neural Networks (ANN) combined with Optuna automatic hyperparameter optimization. The project addresses the difficulty traditional salary prediction methods have in capturing complex nonlinear relationships, providing high-precision support for human resource management decisions.


Section 02

Project Background and Core Objectives

In the field of human resource management, salary prediction is a key issue for enterprise decision-making. Traditional methods rely on empirical judgment or simple statistical models, which struggle to capture complex nonlinear relationships. The core objective of this project is to predict employee salaries using deep learning (an ANN), which can learn complex patterns and nonlinear relationships in the data (salary is influenced by multiple interwoven factors such as educational background, work experience, and job level). The project adopts a complete machine learning engineering workflow and introduces Optuna for automatic hyperparameter tuning to improve model performance.


Section 03

Data Preprocessing and Feature Engineering

Data quality determines the upper limit of the model, so the project conducts systematic cleaning:

  1. Missing Value Handling: Numeric columns are filled with the mean, categorical columns with the mode to preserve distribution characteristics and avoid information loss;
  2. Label Standardization: Unify educational level annotations (e.g., "phD" and "PhD") to ensure accurate feature encoding;
  3. Feature Engineering: Create an "is_mid_career" binary flag (ages 30-45), an "edu_rank" ordinal encoding (mapping levels from high school through doctorate), and a "modal_edu_in_title" feature (the most common educational level among holders of the same job title);
  4. Outlier Handling: Use the IQR method to cap outliers instead of deleting them, preserving information from abnormal samples while preventing extreme values from affecting the model.
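The four cleaning steps above can be sketched in pandas as follows. The column names and toy values here are assumptions for illustration, not the project's actual dataset:

```python
import pandas as pd

# Toy frame standing in for the real dataset; columns are hypothetical.
df = pd.DataFrame({
    "age": [25, 38, None, 52, 41],
    "education": ["phD", "Bachelor", "Master", "PhD", None],
    "salary": [50_000, 90_000, 75_000, 400_000, 95_000],
})

# 1. Missing values: mean for numeric columns, mode for categoricals.
df["age"] = df["age"].fillna(df["age"].mean())
df["education"] = df["education"].fillna(df["education"].mode()[0])

# 2. Label standardization: unify spellings such as "phD" vs "PhD".
df["education"] = df["education"].replace({"phD": "PhD"})

# 3. Feature engineering: mid-career flag and ordinal education rank.
df["is_mid_career"] = df["age"].between(30, 45).astype(int)
edu_rank_map = {"High School": 0, "Bachelor": 1, "Master": 2, "PhD": 3}
df["edu_rank"] = df["education"].map(edu_rank_map)

# 4. Outliers: cap salaries to the 1.5*IQR fences instead of dropping rows.
q1, q3 = df["salary"].quantile(0.25), df["salary"].quantile(0.75)
iqr = q3 - q1
df["salary"] = df["salary"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
```

Capping rather than deleting keeps the row's other features in play while bounding the target's influence on the loss.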

Section 04

Feature Encoding and Model Architecture Design

Feature Encoding: Categorical features are one-hot encoded using pandas get_dummies (drop_first=True to avoid multicollinearity); numeric features are standardized using StandardScaler (the fitted scaler is saved for the inference phase). The dataset is split into training/test sets in an 80/20 ratio (random_state=42 ensures reproducibility).

ANN Architecture: A Multi-Layer Perceptron (MLP) is used. The input layer receives the preprocessed features, two hidden layers use ReLU activation (to mitigate vanishing gradients), and the output layer has a single neuron that directly outputs the salary value. The architecture is simple and effective.


Section 05

Optuna Automatic Hyperparameter Optimization

The Optuna framework is introduced for automated hyperparameter tuning:

  • The search space includes: number of neurons in the two hidden layers, optimizer type (Adam/SGD/RMSprop/Adagrad), learning rate (log-uniform distribution), number of training epochs, and batch size;
  • A Bayesian optimization strategy is used, with 20 trials to find the optimal configuration, and the optimization target is to maximize the test set R² score;
  • Compared to grid/random search, Bayesian optimization explores the parameter space more intelligently, saving time and finding better combinations.

Section 06

Model Training, Evaluation, and Technology Stack

Training and Evaluation: The model is retrained using the optimal parameters found by Optuna. A custom evaluate function reports the R² scores of the training and test sets side by side (to detect overfitting). Evaluation metrics include R² (the proportion of variance explained) and MAE (also used as the training loss; robust to outliers).

Technology Stack: pandas/numpy (data processing), matplotlib/seaborn (visualization), tensorflow (deep learning framework), optuna (hyperparameter tuning), and sklearn (preprocessing and evaluation). The project covers the complete lifecycle, and saving the StandardScaler reflects forward-looking deployment considerations.
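One possible shape for the custom evaluate function described above, exercised here with a linear stand-in model on synthetic data rather than the trained ANN:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score

def evaluate(model, X_train, y_train, X_test, y_test):
    """Report train/test R² side by side (a large gap signals overfitting)
    plus the test-set MAE, which is robust to salary outliers."""
    pred_train = np.ravel(model.predict(X_train))
    pred_test = np.ravel(model.predict(X_test))
    return {
        "r2_train": r2_score(y_train, pred_train),
        "r2_test": r2_score(y_test, pred_test),
        "mae_test": mean_absolute_error(y_test, pred_test),
    }

# Demo with a linear stand-in model on noiseless synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5])
model = LinearRegression().fit(X[:80], y[:80])
report = evaluate(model, X[:80], y[:80], X[80:], y[80:])
```

Because both Keras models and sklearn estimators expose `predict`, the same helper works for the ANN and for any baseline model compared against it.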


Section 07

Practical Insights and Extension Directions

Core Insights: Deep learning requires systematic data preprocessing, reasonable architecture design, and scientific hyperparameter optimization, rather than simply stacking layers. Extension Directions: introduce more features such as geographic location and industry category; try more complex architectures such as residual connections or attention mechanisms; deploy the model as an API service integrated into HR management systems. The project has both educational and practical reference value, demonstrating the complete process from raw data to a reliable prediction model.