Cardiag: An Intelligent Sound-Based Car Fault Diagnosis System

Cardiag is an open-source system that diagnoses mechanical faults by analyzing car engine sounds with machine learning. The project employs 5 machine learning methods, combined with a hybrid expert architecture and an ensemble voting mechanism, and reaches 91.5% accuracy on a 9-class fault classification task.

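Tags: car fault diagnosis, audio classification, machine learning, XGBoost, transfer learning, ensemble learning, sound recognition, intelligent diagnosis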

Published 2026-06-09 20:46 | Recent activity 2026-06-09 20:51 | Estimated read 7 min

Section 01

Cardiag Project Overview: An Intelligent Sound-Based Car Fault Diagnosis System

Cardiag is an open-source car fault diagnosis system that identifies mechanical faults from the sounds of the engine and other components. It trains 5 machine learning methods, adds a hybrid expert architecture and an ensemble voting mechanism, and reaches 91.5% accuracy on a 9-class fault classification task.

The project is maintained by jlacsam and was released on GitHub (https://github.com/jlacsam/cardiag) on June 9, 2026. Its data comes from Kaggle's Car Diagnostics Dataset.

Section 02

Project Background and Problem Definition

Traditional car fault diagnosis relies on the experience of professional technicians and on expensive equipment, so it is costly and makes it hard for ordinary car owners to detect problems early. Cardiag's sound-based approach offers these advantages:

Non-invasive: No need for disassembly or connection to diagnostic equipment
Low cost: Only requires recording devices and computing resources
Early warning: Detects faults before they worsen
Easy deployment: Can be integrated into mobile applications

Section 03

Technical Solution Overview

Task Definition

Classify sound recordings into 9 categories (6 fault classes and 3 normal classes) across 3 vehicle states:

State	Fault Categories
Braking	Normal braking, worn brake pads
Idling	Normal idling, insufficient oil, power steering issues, timing belt failure
Starting	Normal starting, low battery, ignition system failure

Dataset Details

Original samples: 949 WAV files
After augmentation: 1,967 samples (to address class imbalance)
Split: 70% training / 15% validation / 15% testing (stratified sampling)
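
A stratified 70/15/15 split keeps each class at roughly the same proportion in all three subsets. The following is a minimal dependency-free sketch of that idea, not Cardiag's actual code; the function name `stratified_split`, the seed, and the toy labels are assumptions made for illustration. In practice, scikit-learn's `train_test_split` with its `stratify` argument is the usual way to do this.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists with per-class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # shuffle within each class
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]         # remainder goes to the test set
    return train, val, test

# Toy data (hypothetical): 9 classes with 100 samples each.
toy_labels = [class_id for class_id in range(9) for _ in range(100)]
train_idx, val_idx, test_idx = stratified_split(toy_labels)
per_class_in_test = Counter(toy_labels[i] for i in test_idx)
```

Splitting class by class is what makes the split stratified: a rare class cannot end up missing from the validation or test set by chance.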

Section 04

Model and Architecture Comparison

Five Machine Learning Methods

XGBoost: Handcrafted features (MFCC/Delta/Chroma, etc.), accuracy 88.5% (best single model)
CNN: Mel spectrogram input, accuracy 8.1% (poor performance)
CNN-LSTM: Spatial + temporal features, accuracy 14.5% (limited by data scale)
YAMNet Transfer: Frozen pre-trained layers, accuracy 79.1%
PANNs CNN14 Transfer: 2048-dimensional embedding, accuracy 86.2%
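
A tree model such as XGBoost needs one fixed-length feature vector per clip, whatever the clip's duration. The sketch below illustrates that step under stated assumptions: it uses NumPy only and three simple stand-in descriptors (RMS energy, zero-crossing rate, spectral centroid) instead of the MFCC/Delta/Chroma features the project lists, and the function names, frame sizes, and mean/std pooling are hypothetical choices, not Cardiag's implementation.

```python
import numpy as np

def frame_signal(y, frame_len=1024, hop=512):
    """Slice a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(y) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return y[idx]

def clip_features(y, sr, frame_len=1024, hop=512):
    """Pool per-frame descriptors into a fixed-length vector of 6 values."""
    frames = frame_signal(y, frame_len, hop)
    # Root-mean-square energy of each frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Zero-crossing rate: fraction of adjacent samples whose sign differs.
    zcr = np.mean(np.signbit(frames[:, 1:]) != np.signbit(frames[:, :-1]), axis=1)
    # Spectral centroid: magnitude-weighted mean frequency of each frame.
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    centroid = (spectrum @ freqs) / (spectrum.sum(axis=1) + 1e-12)
    per_frame = np.stack([rms, zcr, centroid], axis=1)   # shape (n_frames, 3)
    # Mean and standard deviation over time give a length-independent summary.
    return np.concatenate([per_frame.mean(axis=0), per_frame.std(axis=0)])

# Synthetic 1-second 440 Hz tone in place of a real recording.
sample_rate = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(sample_rate) / sample_rate)
feature_vector = clip_features(tone, sample_rate)
```

The same pooling pattern applies when the per-frame matrix holds MFCC, delta, or chroma coefficients: only the columns change, not the shape of the final vector.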

Hybrid Expert Architecture

Hierarchical design: First determine the state → then classify the fault
Advantages: Strong interpretability, error isolation, specialization
Results: PANNs version 83.8%, XGB version 86.5%
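
The two-stage routing described above can be sketched as follows. This is an illustrative outline, not the project's code: the function `hierarchical_predict`, the toy classifiers, and the dictionary-shaped features are hypothetical, and only the state and category names come from the task table in this article.

```python
# State -> categories, taken from the task definition table.
STATE_TO_CATEGORIES = {
    "braking": ["normal braking", "worn brake pads"],
    "idling": ["normal idling", "insufficient oil",
               "power steering issues", "timing belt failure"],
    "starting": ["normal starting", "low battery", "ignition system failure"],
}

def hierarchical_predict(features, state_classifier, fault_experts):
    """Stage 1 picks the vehicle state; stage 2 asks that state's expert."""
    state = state_classifier(features)
    category = fault_experts[state](features)
    if category not in STATE_TO_CATEGORIES[state]:
        raise ValueError(f"expert for {state!r} returned {category!r}")
    return state, category

# Toy stand-ins for trained models (assumptions for the demo only).
def toy_state_classifier(features):
    return features["state_hint"]

toy_experts = {
    state: (lambda features, cats=cats: cats[features["fault_index"]])
    for state, cats in STATE_TO_CATEGORIES.items()
}

result = hierarchical_predict(
    {"state_hint": "idling", "fault_index": 3},
    toy_state_classifier,
    toy_experts,
)
```

Because each expert only sees the categories of its own state, a mistake inside one expert cannot produce a category from another state, which is one way to read the "error isolation" advantage.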

Ensemble Voting

Ensemble of the Top3 models (XGBoost, Hybrid Expert-PANNs, Hybrid Expert-XGB) with majority voting: accuracy 91.5%
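
Majority voting over three models can be sketched in a few lines. The tie-breaking rule here (fall back to one designated model when all three disagree) is an assumption for illustration; the article does not say how Cardiag resolves ties, and the function name `majority_vote` is hypothetical.

```python
from collections import Counter

def majority_vote(predictions, fallback_index=0):
    """Return the most common label; on a tie, use the fallback model's label."""
    counts = Counter(predictions).most_common()
    top_label, top_count = counts[0]
    if len(counts) > 1 and counts[1][1] == top_count:
        # No strict winner (e.g. three different labels): trust one model.
        return predictions[fallback_index]
    return top_label

# Order assumed here: XGBoost, Hybrid Expert-PANNs, Hybrid Expert-XGB.
agreed = majority_vote(["low battery", "normal starting", "low battery"])
tied = majority_vote(["low battery", "normal starting", "ignition system failure"])
```

With three voters, any label chosen by two models wins outright, so the fallback only matters when all three disagree.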

Section 05

Key Results and Technical Insights

Result Ranking

Rank	Model	Accuracy
1	Ensemble Voting (Top3)	91.5%
2	XGBoost	88.5%
3	Hybrid Expert (XGB)	86.5%
4	PANNs Transfer	86.2%

Insights

Traditional vs Deep Learning: XGBoost outperforms the CNN models, which the project attributes to the small dataset and effective handcrafted features
Value of Transfer Learning: Pre-trained models (YAMNet/PANNs) perform better than networks trained from scratch
Power of Ensemble: Voting raises accuracy by 3 percentage points over the best single model (88.5% → 91.5%) and reduces variance

Section 06

Application Prospects and Challenges

Potential Applications

Mobile app: Car owners record sounds for diagnosis
Repair shop assistance: Reduce misdiagnosis
Vehicle insurance: Remote vehicle condition assessment
Fleet management: Preventive maintenance

Deployment Challenges

Environmental noise interference
Differences between mobile phone microphones and professional equipment
Sound feature differences across different car models
Difficulty in identifying multiple faults simultaneously

Section 07

Open Source Value and Summary

Tech Stack

Python 3.x, TensorFlow, XGBoost, Librosa, Scikit-learn

Open Source Value

Researchers: Audio classification benchmark
Developers: End-to-end reference
Educators: Teaching case
Entrepreneurs: Product prototype

Summary

Cardiag demonstrates the potential of machine learning for sound-based fault diagnosis. In this small-data scenario, handcrafted features combined with ensemble learning outperform end-to-end deep learning. The 91.5% accuracy provides a foundation for deployment, and the hybrid expert architecture supports interpretability, making the project a useful reference case for audio AI.