Multimodal Emotion Recognition in Medical Scenarios: Cross-Age Robustness of Valence Dimensions vs. Labels

The THERADIA-WoZ study compares emotional data from elderly and young people and finds that emotion recognition models based on valence dimensions generalize across age groups significantly better than traditional categorical-label methods.

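Emotion recognition, Multimodal learning, Appraisal theory, Medical AI, Cross-age generalization, Cognitive training, Affective computing
Published 2026-04-30 22:37 | Recent activity 2026-05-01 12:51 | Estimated read 6 min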

Section 01

Introduction: Cross-Age Robustness Advantages of Valence Dimension Emotion Recognition Models

Core finding: by comparing emotional data from elderly and young people, the THERADIA-WoZ study finds that emotion recognition models based on valence dimensions generalize across age groups significantly better than traditional categorical-label methods. This matters for the design of emotion-aware interaction systems in AI healthcare scenarios such as computerized cognitive training (CCT).

Section 02

Research Background: Key Challenges in Emotion Recognition for AI Healthcare

AI is widely applied in medicine, but emotion recognition in human-computer interaction scenarios such as Computerized Cognitive Training (CCT) remains challenging. Traditional emotion recognition relies on discrete categorical labels (such as happy or sad), which tend to lose emotional nuance. Different groups also interpret these labels differently, which weakens model generalization. CCT systems need to recognize emotions accurately in order to adjust training content; otherwise training becomes less effective or leaves users with a negative experience.

Section 03

Research Methods: Cross-Age Dataset and Valence Theory Framework

  1. Cross-Age Dataset: The THERADIA-WoZ corpus is extended with newly collected data from young people, which makes cross-age generalization evaluation possible;
  2. Valence Theory Perspective: Valence dimensions (pleasure, arousal, dominance) replace discrete labels as the emotional representation, so that continuous emotional features can be captured;
  3. Experimental Design: Three evaluation scenarios are used to test generalization comprehensively: intra-corpus (training and testing within the same age group), cross-corpus (training on one age group and testing on the other), and mixed corpus (training on data from both age groups).
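The three evaluation scenarios can be illustrated with a minimal sketch. It assumes each age group's corpus has already been split into training and test partitions; the function name, the scenario keys, and the choice to evaluate in both directions are illustrative assumptions, not details taken from the study.

```python
from typing import Dict, List, Tuple

# Hypothetical types: a sample is any annotated recording, represented as a dict.
Split = Tuple[List[dict], List[dict]]  # (training samples, test samples)


def build_scenarios(
    elderly_train: List[dict],
    elderly_test: List[dict],
    young_train: List[dict],
    young_test: List[dict],
) -> Dict[str, Split]:
    """Pair training and test partitions for each evaluation scenario."""
    mixed_train = elderly_train + young_train
    return {
        # Intra-corpus: train and test within the same age group.
        "intra_elderly": (elderly_train, elderly_test),
        "intra_young": (young_train, young_test),
        # Cross-corpus: train on one age group, test on the other.
        "cross_elderly_to_young": (elderly_train, young_test),
        "cross_young_to_elderly": (young_train, elderly_test),
        # Mixed corpus: train on both groups, test on each group separately.
        "mixed_to_elderly": (mixed_train, elderly_test),
        "mixed_to_young": (mixed_train, young_test),
    }
```

Keeping the test partitions identical across scenarios means any difference in scores can be attributed to the training data rather than to a change in the test set.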

Section 04

Core Evidence: Cross-Age Generalization Advantages of Valence Dimension Models

Experimental results show:

  • In all scenarios, valence-dimension models outperform categorical-label models;
  • In cross-corpus evaluation, the performance of categorical-label models falls sharply to chance level, while valence-dimension models remain significantly above chance;
  • Training on mixed data does not improve generalization, which indicates that the advantage comes from the cross-group stability of the representation itself.
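For the categorical-label case, "chance level" can be made concrete with a small sketch. The article does not say which baseline or metric the study used, so the two baselines below (uniform guessing and always predicting the majority class) and the `margin` parameter are assumptions for illustration only.

```python
from collections import Counter
from typing import Sequence, Tuple


def chance_levels(labels: Sequence[str]) -> Tuple[float, float]:
    """Return (uniform-guess accuracy, majority-class accuracy) for a test set."""
    counts = Counter(labels)
    uniform = 1.0 / len(counts)                    # guess each class equally often
    majority = max(counts.values()) / len(labels)  # always predict the largest class
    return uniform, majority


def beats_chance(accuracy: float, labels: Sequence[str], margin: float = 0.05) -> bool:
    """True if accuracy exceeds the stronger baseline by at least `margin`."""
    uniform, majority = chance_levels(labels)
    return accuracy > max(uniform, majority) + margin
```

On an imbalanced test set the majority-class baseline is the stricter of the two, so a model that only matches uniform guessing may look better than it is.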

Section 05

Clinical Application Value and Practical Guidance

Guidance for AI healthcare:

  1. Representation Method Selection: Prioritize valence dimensions over categorical labels, especially for diverse user groups;
  2. Model Deployment: Valence dimension models are suitable for cross-age deployment, reducing separate training costs;
  3. System Feedback: Continuous dimensions support fine-grained feedback, so training parameters can be adjusted according to pleasure and arousal.

In addition, multimodal fusion (voice + facial expressions) improves recognition accuracy, and valence dimensions provide a unified framework for multimodal integration. The research team has also released a time-continuous emotion prediction API to support follow-up research.
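The system-feedback idea (adjusting training parameters from pleasure and arousal) might look like the sketch below. Everything specific here is hypothetical: the assumption that both dimensions are scaled to [-1, 1], the 0.3 thresholds, the 1 to 10 difficulty range, and the rules themselves. It is not the study's method and not a clinically validated policy.

```python
def adjust_difficulty(
    level: int,
    pleasure: float,
    arousal: float,
    min_level: int = 1,
    max_level: int = 10,
) -> int:
    """Return the next difficulty level given predicted pleasure and arousal.

    Assumes both predictions are scaled to [-1, 1] (an illustrative convention).
    """
    if pleasure < -0.3 and arousal > 0.3:
        step = -1   # unpleasant and highly aroused: ease off
    elif pleasure >= 0.0 and arousal < -0.3:
        step = 1    # comfortable but under-stimulated: raise the challenge
    else:
        step = 0    # otherwise keep the current level
    # Clamp to the allowed difficulty range.
    return max(min_level, min(max_level, level + step))
```

Because the inputs are continuous, the thresholds can be tuned per deployment, which is not possible when the only signal is a discrete label.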

Section 06

Future Research Directions and Prospects

Future work could:

  • Extend the data to more age groups, such as children and adolescents, to build a model that covers the full lifespan;
  • Verify how well valence dimensions generalize across different cultural backgrounds;
  • Link emotion recognition accuracy to the clinical outcomes of cognitive training;
  • Optimize model efficiency to support real-time online CCT scenarios.