Section 01
[Introduction] Application and Challenges of Frozen Multimodal Embeddings in AVI Psychological Assessment
The research team proposes using frozen multimodal encoders (CLIP, Whisper, RoBERTa, etc.) to assess personality and cognitive ability from asynchronous video interviews (AVI). In the ACM Multimedia AVI Challenge 2026, their approach significantly outperformed the baseline. The work also reveals potential dataset shortcuts in cognitive ability prediction.