Zing Forum

Human Activity Recognition Based on WISDM Dataset: A Comparative Study of Traditional Machine Learning and Deep Learning Methods

This article presents a comparative study of Random Forest, a CNN, and a hybrid CNN-LSTM architecture for human activity recognition on wearable device sensor data, and highlights the key role of temporal modeling in activity classification.

Tags: human activity recognition, deep learning, CNN-LSTM, wearable devices, sensor data, machine learning, time series analysis, WISDM dataset
Published 2026-06-11 10:45 | Recent activity 2026-06-11 10:49 | Estimated read 4 min

Section 01

[Introduction] Core Overview of Comparative Study on HAR Methods Based on WISDM Dataset

Using the WISDM dataset, this article compares traditional machine learning (Random Forest) with deep learning (a CNN and a hybrid CNN-LSTM architecture) for Human Activity Recognition (HAR). The hybrid CNN-LSTM performs best, which points to the key role of temporal modeling in activity classification.

Section 02

Research Background and Introduction to WISDM Dataset

Human Activity Recognition (HAR) is a core technology in mobile computing and health monitoring, and it must cope with complex, noisy sensor data. The WISDM dataset contains accelerometer and gyroscope data from smartphones and smartwatches covering 18 daily activities, which makes it a suitable benchmark for evaluating HAR algorithms.

Section 03

Design Details of Three Model Architectures

  1. Random Forest: Relies on manual feature engineering, with time-domain features (mean, variance, etc.) and frequency-domain features (FFT) extracted from the sensor signals;
  2. CNN: Learns end to end, extracting spatial features through convolutional layers;
  3. CNN-LSTM: Combines the CNN's spatial feature extraction with the LSTM's temporal modeling to capture both the spatial patterns and the temporal dynamics of activities.
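
The manual feature engineering behind the Random Forest model could look roughly like the sketch below. It is an illustration under assumptions, not the study's actual pipeline: the function name `extract_features`, the window shape (200 samples, 3 channels), and the choice of spectral energy and dominant frequency bin as the FFT-derived features are all hypothetical, and the input is synthetic noise rather than WISDM data.

```python
import numpy as np

def extract_features(windows: np.ndarray) -> np.ndarray:
    """Map raw windows of shape (n_windows, n_samples, n_channels)
    to a feature matrix of shape (n_windows, 4 * n_channels)."""
    # Time-domain statistics, computed per channel along the sample axis.
    mean = windows.mean(axis=1)
    var = windows.var(axis=1)

    # Frequency-domain features from the magnitude of the real FFT.
    spectrum = np.abs(np.fft.rfft(windows, axis=1))
    # Drop the DC bin (index 0), which only restates the mean.
    spectrum = spectrum[:, 1:, :]
    spectral_energy = (spectrum ** 2).sum(axis=1)
    # +1 restores the bin index of the original (undropped) spectrum.
    dominant_bin = spectrum.argmax(axis=1) + 1

    return np.concatenate([mean, var, spectral_energy, dominant_bin], axis=1)

# Synthetic stand-in for sensor data (assumed shape, not real WISDM windows).
rng = np.random.default_rng(0)
windows = rng.normal(size=(8, 200, 3))
features = extract_features(windows)
print(features.shape)  # (8, 12)
```

A matrix like `features`, paired with one activity label per window, is the kind of input that Scikit-learn's `RandomForestClassifier.fit` expects.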

Section 04

Experimental Results and Performance Comparison

Test set accuracy: Random Forest 50.6%, CNN 49.3%, CNN-LSTM 69.8%. CNN-LSTM clearly outperforms the other two models, leading Random Forest by 19.2 percentage points and the CNN by 20.5, which supports the importance of temporal modeling.
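
The size of the gap can be checked directly from the three reported accuracies. The short sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# Test set accuracies as reported in the article, in percent.
accuracy = {"Random Forest": 50.6, "CNN": 49.3, "CNN-LSTM": 69.8}

# Best model and its lead over each other model, in percentage points.
best = max(accuracy, key=accuracy.get)
gaps = {
    name: round(accuracy[best] - acc, 1)
    for name, acc in accuracy.items()
    if name != best
}

print(best)  # CNN-LSTM
print(gaps)  # {'Random Forest': 19.2, 'CNN': 20.5}
```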

Section 05

Result Analysis and Key Insights

The advantage of CNN-LSTM comes from capturing both spatial patterns and temporal dependencies, which helps it distinguish similar activities and stay robust to noise. The pure CNN performs poorly because it cannot model long-term temporal dependencies. Random Forest also has low accuracy, but it is fast to train and highly interpretable, which makes it suitable for resource-constrained scenarios.
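
To make the combination of convolutional feature extraction and recurrent temporal modeling concrete, here is a minimal Keras sketch of a CNN-LSTM classifier. It is an assumption-laden illustration rather than the architecture used in the study: the helper name `build_cnn_lstm`, the window length of 200 samples, the 3 input channels, and all layer sizes are hypothetical. Only the 18 output classes come from the dataset description.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn_lstm(n_samples=200, n_channels=3, n_classes=18):
    """Hypothetical CNN-LSTM: Conv1D layers extract local patterns,
    then an LSTM models how those patterns evolve over time."""
    return keras.Sequential([
        keras.Input(shape=(n_samples, n_channels)),
        # Convolutional front end: local feature extraction.
        layers.Conv1D(64, kernel_size=5, activation="relu", padding="same"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(64, kernel_size=5, activation="relu", padding="same"),
        layers.MaxPooling1D(pool_size=2),
        # Recurrent back end: temporal dependencies across the pooled sequence.
        layers.LSTM(64),
        # One probability per activity class.
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_cnn_lstm()
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Forward pass on dummy input to confirm the output shape.
probs = model.predict(np.zeros((2, 200, 3), dtype="float32"), verbose=0)
print(probs.shape)  # (2, 18)
```

Removing the `LSTM` layer and replacing it with a pooling or flatten step would give a plain CNN baseline, which is the comparison the analysis above turns on.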

Section 06

Technical Implementation and Application Prospects

Implementation tech stack: Python, with Pandas and NumPy for data processing, Scikit-learn for the Random Forest, and TensorFlow/Keras for the deep learning models, among others. Application prospects include health monitoring and sports analysis. Future directions include multi-modal fusion, lightweight models, transfer learning, and real-time processing.