Cardiag: An Intelligent Sound-Based Car Fault Diagnosis System

Introducing the Cardiag project, an open-source system that diagnoses mechanical faults by analyzing car engine sounds using machine learning. The project employs 5 different machine learning methods, combined with a hybrid expert architecture and an ensemble voting mechanism, achieving an accuracy of 91.5% on a 9-class fault classification task.

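Tags: car fault diagnosis, audio classification, machine learning, XGBoost, transfer learning, ensemble learning, sound recognition, intelligent diagnosis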
Published 2026-06-09 20:46 | Recent activity 2026-06-09 20:51 | Estimated read 7 min

Section 01

Cardiag Project Overview: An Intelligent Sound-Based Car Fault Diagnosis System

Cardiag is an open-source intelligent car fault diagnosis system that identifies mechanical faults by analyzing sounds from car engines and other components using machine learning. The project uses 5 machine learning methods, combined with a hybrid expert architecture and an ensemble voting mechanism, achieving an accuracy of 91.5% on a 9-class fault classification task.

The project is maintained by jlacsam and was released on GitHub (https://github.com/jlacsam/cardiag) on June 9, 2026. Its data comes from the Car Diagnostics Dataset on Kaggle.


Section 02

Project Background and Problem Definition

Traditional car fault diagnosis relies on the experience of professional technicians and on expensive equipment, which makes it costly and makes it hard for ordinary car owners to detect problems early. Cardiag's sound-analysis approach offers the following advantages:

  • Non-invasive: No need for disassembly or connection to diagnostic equipment
  • Low cost: Only requires recording devices and computing resources
  • Early warning: Detects faults before they worsen
  • Easy deployment: Can be integrated into mobile applications

Section 03

Technical Solution Overview

Task Definition

Classify sound recordings into 9 fault categories, covering 3 vehicle states:

State    | Fault categories
Braking  | Normal braking, worn brake pads
Idling   | Normal idling, insufficient oil, power steering issues, timing belt failure
Starting | Normal starting, low battery, ignition system failure

Dataset Details

  • Original samples: 949 WAV files
  • After augmentation: 1967 (to address class imbalance)
  • Split: 70% training / 15% validation / 15% testing (stratified sampling)
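
The stratified 70/15/15 split can be illustrated with a small, standard-library-only sketch. This is not Cardiag's code: the function name, the toy label names, and the seed are assumptions made for illustration, and the source does not say whether the split is applied before or after augmentation. The idea is simply that each class is shuffled and divided separately, so every subset keeps roughly the same class proportions.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists, splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Toy labels standing in for the real annotations (hypothetical names and counts).
labels = ["normal_idling"] * 100 + ["insufficient_oil"] * 40 + ["worn_brake_pads"] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

In practice the same effect is usually obtained with scikit-learn's `train_test_split` and its `stratify` argument, applied twice (first to carve off 30%, then to halve that part).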

Section 04

Model and Architecture Comparison

Five Machine Learning Methods

  1. XGBoost: Handcrafted features (MFCC/Delta/Chroma, etc.), accuracy 88.5% (best single model)
  2. CNN: Mel spectrogram input, accuracy 8.1% (poor performance, below the roughly 11% chance level of a balanced 9-class task)
  3. CNN-LSTM: Spatial + temporal features, accuracy 14.5% (barely above chance, limited by data scale)
  4. YAMNet Transfer: Frozen pre-trained layers, accuracy 79.1%
  5. PANNs CNN14 Transfer: 2048-dimensional embedding, accuracy 86.2%
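
To make the "handcrafted features" of the XGBoost approach concrete, here is a minimal sketch of how MFCC, delta, and chroma features might be turned into one fixed-length vector per recording. It is an assumption-laden illustration, not the project's actual pipeline: the sample rate, `n_mfcc=13`, the mean/std pooling, and both function names are choices made here, and the source's "etc." suggests the real feature set is larger.

```python
import numpy as np

def summarize(frames):
    """Collapse a (n_features, n_frames) matrix into per-feature mean and std."""
    frames = np.asarray(frames, dtype=np.float64)
    return np.concatenate([frames.mean(axis=1), frames.std(axis=1)])

def extract_features(path, sr=22050, n_mfcc=13):
    """Load one WAV file and return a fixed-length feature vector."""
    import librosa  # imported lazily so summarize() can be used without librosa

    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # timbre
    delta = librosa.feature.delta(mfcc)                     # change of MFCCs over time
    chroma = librosa.feature.chroma_stft(y=y, sr=sr)        # 12 pitch-class energies
    # All three share the same frame count, so they can be stacked row-wise.
    return summarize(np.vstack([mfcc, delta, chroma]))
```

With these settings each recording becomes a vector of (13 + 13 + 12) × 2 = 76 numbers, regardless of clip length. Such vectors could then be fed to a tabular classifier, for example `xgboost.XGBClassifier`, with the fault labels encoded as integers.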

Hybrid Expert Architecture

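The hybrid expert architecture is hierarchical: a first model determines the vehicle state (braking, idling, or starting), and a state-specific expert then classifies the fault. The design offers strong interpretability, error isolation, and specialization. Results: 83.8% accuracy for the PANNs version and 86.5% for the XGB version.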
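
The two-stage routing can be sketched as follows. Everything here is illustrative: the label identifiers, the `diagnose` function, and the stand-in models are assumptions, since the source lists the fault categories but not how the code names or wires them.

```python
# Label names are illustrative; the source gives the categories, not identifiers.
FAULTS_BY_STATE = {
    "braking": ["normal_braking", "worn_brake_pads"],
    "idling": ["normal_idling", "insufficient_oil",
               "power_steering_issue", "timing_belt_failure"],
    "starting": ["normal_starting", "low_battery", "ignition_system_failure"],
}

def diagnose(features, state_model, experts):
    """Stage 1 picks the vehicle state; stage 2 asks that state's expert for the fault."""
    state = state_model(features)
    if state not in experts:
        raise KeyError(f"no expert registered for state {state!r}")
    fault = experts[state](features)
    if fault not in FAULTS_BY_STATE[state]:
        raise ValueError(f"{fault!r} is not a valid fault for state {state!r}")
    return state, fault

# Stand-in models; real ones would be trained classifiers.
def toy_state_model(features):
    return "idling"

toy_experts = {
    "braking": lambda features: "normal_braking",
    "idling": lambda features: "insufficient_oil",
    "starting": lambda features: "normal_starting",
}

state, fault = diagnose([0.1, 0.2], toy_state_model, toy_experts)
print(state, fault)  # idling insufficient_oil
```

Each expert only has to separate two to four classes instead of nine, which is where the specialization comes from. The trade-off is that a wrong state decision in stage one cannot be corrected in stage two.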

Ensemble Voting

The ensemble combines the Top3 models (XGBoost, Hybrid Expert-PANNs, and Hybrid Expert-XGB) by majority voting and reaches 91.5% accuracy.
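
Majority voting over three models fits in a few lines. The sketch below is an illustration rather than Cardiag's implementation: the function names and sample predictions are made up, and the tie-break rule (a tie goes to the earliest model in the list) is an assumption, because the source does not say how a three-way disagreement is resolved.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most common label among the models' predictions for one sample."""
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    # Ties go to the earliest model in the list, so list the strongest model first.
    for label in predictions:
        if counts[label] == best:
            return label

def ensemble_predict(per_model_predictions):
    """Vote sample by sample across several models' prediction lists."""
    return [majority_vote(votes) for votes in zip(*per_model_predictions)]

# Hypothetical predictions for three recordings.
xgb = ["low_battery", "worn_brake_pads", "normal_idling"]
hybrid_panns = ["low_battery", "normal_braking", "insufficient_oil"]
hybrid_xgb = ["normal_starting", "normal_braking", "timing_belt_failure"]

print(ensemble_predict([xgb, hybrid_panns, hybrid_xgb]))
# ['low_battery', 'normal_braking', 'normal_idling']
```

Voting helps only when the models make different mistakes: in the second recording the two hybrid models outvote an XGBoost error, while in the third all three disagree and the result falls back to the first model.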


Section 05

Key Results and Technical Insights

Result Ranking

Rank | Model                  | Accuracy
1    | Ensemble Voting (Top3) | 91.5%
2    | XGBoost                | 88.5%
3    | Hybrid Expert (XGB)    | 86.5%
4    | PANNs Transfer         | 86.2%

Insights

  • Traditional vs deep learning: XGBoost outperforms the CNN because the dataset is small and the handcrafted features are effective
  • Value of transfer learning: Pre-trained models (YAMNet/PANNs) perform better than models trained from scratch
  • Power of ensembles: Voting raises accuracy by 3 percentage points over the best single model (88.5% → 91.5%) and reduces variance

Section 06

Application Prospects and Challenges

Potential Applications

  1. Mobile app: Car owners record sounds for diagnosis
  2. Repair shop assistance: Reduce misdiagnosis
  3. Vehicle insurance: Remote vehicle condition assessment
  4. Fleet management: Preventive maintenance

Deployment Challenges

  • Environmental noise interference
  • Differences between mobile phone microphones and professional equipment
  • Sound feature differences across different car models
  • Difficulty in identifying multiple faults simultaneously

Section 07

Open Source Value and Summary

Tech Stack

Python 3.x, TensorFlow, XGBoost, Librosa, Scikit-learn

Open Source Value

  • Researchers: Audio classification benchmark
  • Developers: End-to-end reference
  • Educators: Teaching case
  • Entrepreneurs: Product prototype

Summary

Cardiag demonstrates the potential of machine learning for sound analysis. In small-data scenarios, traditional features combined with ensemble learning outperform end-to-end deep learning. The 91.5% accuracy provides a foundation for deployment, and the hybrid expert architecture supports interpretability, making the project a useful reference case for audio AI.