Zing Forum

SZAtt-Net: A Multimodal Deep Learning Model Integrating Attention Mechanism for Schizophrenia Classification

SZAtt-Net is a novel deep learning framework that integrates Conv2D, BiGRU, and attention mechanisms to classify schizophrenia from EEG and MRI data, with reported accuracy above 96% on three benchmark datasets.

Tags: schizophrenia, EEG, MRI, deep learning, attention mechanism, neuroimaging, multimodal, BiGRU, Conv2D, psychiatric AI
Published 2026-05-16 18:14 | Recent activity 2026-05-16 18:19 | Estimated read 6 min

Section 01

Introduction: SZAtt-Net - A Multimodal Model Integrating Attention Mechanism for Schizophrenia Classification

SZAtt-Net is a novel deep learning framework that integrates Conv2D, BiGRU, and attention mechanisms. It classifies schizophrenia from EEG and MRI data, with reported accuracy above 96% on three benchmark datasets, and offers a new path toward the objective diagnosis of mental disorders.

Section 02

Research Background and Challenges

Schizophrenia is highly heterogeneous. Traditional diagnosis relies on subjective assessment, which leads to inconsistent standards and makes early identification difficult. EEG (high temporal resolution) and MRI (structural and functional information) offer complementary views of the neural basis of the disease, but integrating multimodal data effectively to extract diagnostic features remains a core challenge in computational psychiatry. Most existing studies focus on a single modality or fail to fully exploit this complementary information.

Section 03

Detailed Architecture of the SZAtt-Net Model

Core Components

  • Conv2D Convolutional Layer: Learns spatial features from EEG topographic maps and MRI slices, detecting local patterns (e.g., EEG frequency distribution, abnormalities in MRI brain regions);
  • BiGRU (Bidirectional Gated Recurrent Unit): Captures temporal dependencies in EEG; its gating mitigates vanishing gradients, and the bidirectional design uses both past and future context;
  • Attention Mechanism: Automatically weights the feature regions most relevant to classification, loosely mimicking how experts read images, to highlight candidate neural markers.
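
The attention step can be illustrated with a minimal, dependency-free sketch. It assumes a simple dot-product scoring scheme with a learned query vector; the article does not say which attention variant SZAtt-Net uses, so the function names, the `query` parameter, and the toy numbers below are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Collapse a sequence of hidden states into one context vector.

    hidden_states: list of T vectors (e.g. BiGRU outputs, one per time step).
    query: hypothetical learned vector of the same length, used to score each step.
    """
    # Score each time step by its dot product with the query vector.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum of the hidden states, one output value per feature dimension.
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return context, weights

# Toy input: three time steps with two features each (made-up numbers).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context, weights = attention_pool(states, query=[1.0, 1.0])
```

In this toy case the third time step scores highest, so it receives the largest weight and dominates the context vector. The returned `weights` are the kind of quantity one would inspect when asking which time steps or regions the model attends to.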

Multimodal Fusion Strategy

EEG signals are first converted to topographic maps and then processed by the Conv2D, BiGRU, and attention layers. MRI slices go directly through convolutional layers that detect morphological abnormalities. The features from both modalities are then fused at a high level (late in the network) to integrate their complementary information.
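
High-level fusion is often implemented as concatenation of the per-modality feature vectors followed by a small classifier. The sketch below assumes exactly that, with a single logistic output unit; the article does not describe the fusion layer in detail, so `late_fusion_probability`, its parameters, and the example values are hypothetical.

```python
import math

def late_fusion_probability(eeg_features, mri_features, weights, bias):
    """Concatenate per-modality feature vectors and apply a logistic output layer."""
    fused = list(eeg_features) + list(mri_features)  # feature-level concatenation
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    # Sigmoid maps the logit to a probability for the positive class.
    return 1.0 / (1.0 + math.exp(-logit))

# Made-up feature vectors standing in for the outputs of the EEG and MRI branches.
eeg_vec = [0.5, -1.0]
mri_vec = [2.0]
prob = late_fusion_probability(eeg_vec, mri_vec, weights=[1.0, 1.0, 0.5], bias=-0.5)
```

Fusing late keeps each branch free to use layers suited to its own modality, at the cost of not modeling low-level interactions between the EEG and MRI inputs.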

Section 04

Experimental Results and Performance Evaluation

Test results on three benchmark datasets:

  • Kaggle EEG Dataset: 99.37% accuracy, separating patients from healthy controls almost perfectly;
  • LMSU EEG Dataset: 98.92% accuracy, suggesting that performance is not tied to a single set of equipment or experimental conditions;
  • Hippocampal MRI Dataset: 96.33% accuracy, presented as the first purely deep learning approach to MRI-based schizophrenia classification.
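
Accuracy alone can hide how errors are split between patients and controls, so it is usually read alongside sensitivity and specificity. The following sketch shows how all three follow from a confusion matrix. The counts are invented for illustration and are not taken from the SZAtt-Net experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: patients classified correctly/incorrectly.
    tn/fp: healthy controls classified correctly/incorrectly.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of patients detected
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
    }

# Hypothetical counts: 50 patients and 50 controls.
metrics = classification_metrics(tp=48, tn=47, fp=3, fn=2)
```

With these made-up counts the accuracy is 0.95, the sensitivity 0.96, and the specificity 0.94, which shows that even a high accuracy still corresponds to some misclassified subjects.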

Section 05

Technical Innovations and Academic Contributions

  1. Attention Mechanism Comparison: Systematically analyzes different attention variants and finds that choosing the variant to suit the task improves performance, which offers guidance for neuroimaging analysis;
  2. Deep Learning Breakthrough in MRI: Addresses the lack of purely deep learning approaches to MRI-based classification; end-to-end learning can pick up more subtle morphological changes;
  3. Unified Multimodal Framework: The same architecture handles both EEG and MRI, lowering the barrier to multimodal research and laying the groundwork for integrating more data types.

Section 06

Clinical Significance and Application Prospects

Auxiliary Diagnostic Tool

With its high reported accuracy, the model could serve as a clinical "second opinion", improving diagnostic objectivity and early identification, particularly in resource-poor areas.

Disease Mechanism Research

Analyzing the attention weights can identify key brain regions and neural patterns, helping to reveal the neurobiological mechanisms of the disease.

Treatment Response Prediction

In the future, the approach could be extended to predict treatment response, supporting personalized medicine.

Section 07

Limitations and Future Directions

Current Limitations

  • Small dataset size;
  • Cross-center generalization ability needs verification;
  • Lack of longitudinal tracking data.
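
One common way to probe cross-center generalization is leave-one-site-out evaluation, where every recording site is held out once as the test set. The sketch below shows only the splitting logic. The `(subject_id, site)` record layout and the site labels are hypothetical, and the article does not state that the authors plan to use this protocol.

```python
def leave_one_site_out(records):
    """Yield (held_out_site, train_ids, test_ids) so each site is tested exactly once.

    records: list of (subject_id, site) pairs; both fields are hypothetical labels.
    """
    sites = sorted({site for _, site in records})
    for held_out in sites:
        # Train on every other site, test only on the held-out one.
        train = [sid for sid, site in records if site != held_out]
        test = [sid for sid, site in records if site == held_out]
        yield held_out, train, test

# Four made-up subjects recorded at three made-up sites.
records = [("s01", "A"), ("s02", "A"), ("s03", "B"), ("s04", "C")]
splits = list(leave_one_site_out(records))
```

Because no subject from the held-out site appears in training, a large drop in accuracy under this protocol would point to site-specific effects such as equipment or acquisition differences.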

Future Directions

  • Explore EEG/MRI hybrid models;
  • Introduce Grad-CAM/SHAP to enhance interpretability;
  • Expand to diverse datasets and explore semi-supervised/self-supervised learning.