
EXIST 2026: A Multimodal Sexism Detection System Integrating Eye-Tracking, Heart Rate, and EEG Signals

A People-Centered Multimodal Sexism Detection Study Combining Eye-Tracking, Heart Rate Monitoring, EEG, and Vision-Language Models

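Multimodal Learning, Sexism Detection, Eye-Tracking, EEG, Heart Rate Monitoring, Content Moderation, TikTok, Vision-Language Models, AI Safety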

Published 2026-06-06 17:27 | Recent activity 2026-06-06 17:57 | Estimated read 9 min


Section 01

[Introduction] EXIST 2026: A Multimodal Sexism Detection System Integrating Physiological Signals and Vision-Language Models

Project Overview

Original Author/Maintainer: ivanarcos02
Source Platform: GitHub
Release Time: June 2026
Core Direction: EXIST 2026 Challenge on "People-Centered Multimodal Sexism Detection"

Core Innovation

The project combines eye-tracking, heart rate monitoring, EEG signals, and Vision-Language Models (VLMs) in a single multimodal system. It uses human physiological and cognitive responses as additional evidence for detecting sexist content, going beyond what text or image analysis alone can capture.

Application Scenarios

The approach targets content moderation on social media platforms such as TikTok and explores a "people-centered" paradigm for AI safety.

Section 02

Research Background and Problem Definition

Limitations of Traditional Detection

Traditional sexism detection relies on text analysis or image recognition and ignores the physiological and cognitive responses people show when they encounter discriminatory content.

Background of EXIST Challenge

EXIST (sEXism Identification in Social neTworks) is a shared evaluation task that started in the IberLEF series; more recent editions have run as a CLEF lab. The 2026 edition focuses on "people-centered multimodal detection". Its core hypothesis is that humans produce measurable physiological responses when viewing discriminatory content, and that these responses can serve as detection signals.

Core Problem

How can physiological signals be integrated with AI models so that sexism detection becomes more accurate and better aligned with how people actually experience the content?

Section 03

Core Innovation: Integration of Multimodal Physiological Signals and VLM

Physiological Signal Collection

Eye-Tracking: Analyze fixation distribution, saccade paths, pupil changes, and regression behavior to reflect attention allocation and emotional arousal.
Heart Rate Monitoring: Capture autonomic nervous system responses via heart rate variability (HRV), heart rate acceleration, and temporal correlation.
EEG: Extract event-related potentials (ERP), spectral features, and brain region activation to directly measure neural activity.
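
As an illustration of the heart rate features listed above, here is a minimal sketch of two standard time-domain HRV measures (SDNN and RMSSD) computed from RR intervals. The function name `hrv_features`, the input format (RR intervals in milliseconds), and the example values are assumptions made for illustration; the project's own feature set is not described in this summary.

```python
import numpy as np


def hrv_features(rr_ms):
    """Time-domain HRV features from a sequence of RR intervals in milliseconds.

    Hypothetical helper for illustration; not taken from the project.
    """
    rr = np.asarray(rr_ms, dtype=float)
    successive_diffs = np.diff(rr)
    return {
        # Mean heart rate in beats per minute (60000 ms per minute).
        "mean_hr_bpm": 60000.0 / rr.mean(),
        # SDNN: sample standard deviation of the RR intervals.
        "sdnn_ms": rr.std(ddof=1),
        # RMSSD: root mean square of successive RR differences.
        "rmssd_ms": np.sqrt(np.mean(successive_diffs ** 2)),
    }


# Made-up RR intervals for a short segment of video viewing.
rr_example = [800, 810, 790, 805, 795]
features = hrv_features(rr_example)
```

In practice these features would be computed per video segment, so that changes in HRV can be related to what the viewer was watching at the time.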

Vision-Language Model (VLM)

Integrates models such as CLIP and BLIP for video frame understanding, cross-modal alignment, and context modeling, and extracts visual-semantic features from them.

Fusion Logic

Combine physiological signals with VLM features to build a multimodal detection system, compensating for the shortcomings of single modalities.
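
One simple way to combine modalities is early fusion: standardize each modality's feature matrix and concatenate along the feature axis. The sketch below assumes this strategy purely for illustration; the summary does not say which fusion method the project uses, and the feature dimensions and random data are placeholders.

```python
import numpy as np


def zscore(x, eps=1e-8):
    """Standardize each feature column to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / (x.std(axis=0) + eps)


def early_fusion(modalities):
    """Concatenate standardized per-modality feature matrices.

    `modalities` maps a modality name to an array of shape
    (n_samples, n_features). All modalities must share n_samples.
    """
    blocks = [zscore(m) for m in modalities.values()]
    if len({b.shape[0] for b in blocks}) != 1:
        raise ValueError("all modalities need the same number of samples")
    return np.concatenate(blocks, axis=1)


# Placeholder features: 4 samples, with made-up dimensions per modality.
rng = np.random.default_rng(0)
modalities = {
    "vlm": rng.normal(size=(4, 8)),
    "eye": rng.normal(size=(4, 3)),
    "hrv": rng.normal(size=(4, 2)),
    "eeg": rng.normal(size=(4, 5)),
}
fused = early_fusion(modalities)  # shape (4, 18)
```

Standardizing before concatenation keeps a modality with large raw values (for example, EEG amplitudes) from dominating the fused vector. Late fusion, where each modality gets its own classifier and the scores are combined, is a common alternative.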

Section 04

Analysis of Technical Implementation Architecture

Preprocessing Pipeline

Time Synchronization: Align physiological signals with the video timeline
Signal Filtering: Remove noise and artifacts
Feature Extraction: Extract valid features from raw signals
Data Cleaning: Handle missing values and outliers
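
The data cleaning step can be sketched as follows, assuming linear interpolation for missing samples and clipping based on the median absolute deviation (MAD) for outliers. Both choices, the threshold of 5 MADs, and the function names are illustrative assumptions rather than the project's documented method.

```python
import numpy as np


def interpolate_missing(signal):
    """Fill NaN samples by linear interpolation between valid neighbors."""
    x = np.asarray(signal, dtype=float).copy()
    missing = np.isnan(x)
    if missing.all():
        raise ValueError("signal contains no valid samples")
    idx = np.arange(x.size)
    x[missing] = np.interp(idx[missing], idx[~missing], x[~missing])
    return x


def clip_outliers(signal, n_mads=5.0):
    """Clip samples lying more than n_mads MADs away from the median."""
    x = np.asarray(signal, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    if mad == 0:
        return x.copy()
    return np.clip(x, median - n_mads * mad, median + n_mads * mad)


# Made-up heart rate trace with one dropped sample and one sensor spike.
raw = [70.0, 71.0, np.nan, 73.0, 250.0, 72.0]
cleaned = clip_outliers(interpolate_missing(raw))
```

Median-based statistics are used here because a single large artifact would inflate a mean and standard deviation and could hide the very outlier being removed.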

Prompt Engineering

Design prompt templates that state the task definition, the fine-grained labels (e.g., direct discrimination, micro-discrimination), and how context information should be used.
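
A prompt template of this kind might look like the sketch below. The wording, the "not sexist" label, and the `caption` and `transcript` context fields are assumptions for illustration; only the "direct discrimination" and "micro-discrimination" labels come from the description above.

```python
# Label set: the last two come from the text; "not sexist" is assumed.
LABELS = ["not sexist", "direct discrimination", "micro-discrimination"]

# Hypothetical template covering task definition, labels, and context.
PROMPT_TEMPLATE = """Task: decide whether the video described below contains sexist content.
Choose exactly one label from: {labels}.

Context:
- Caption: {caption}
- Transcript: {transcript}

Answer with the label only."""


def build_prompt(caption, transcript, labels=LABELS):
    """Fill the template with the label list and per-video context."""
    return PROMPT_TEMPLATE.format(
        labels=", ".join(labels),
        caption=caption.strip(),
        transcript=transcript.strip(),
    )


prompt = build_prompt("Example caption", "Example transcript")
```

Restricting the answer to a fixed label list makes the model output easy to parse and compare against annotations.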

Experimental Configuration

Provide hyperparameter settings, training strategies, and evaluation metrics suitable for sexism detection.
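
The summary does not name the evaluation metrics. As a hedged example, precision, recall, and F1 for the positive (sexist) class are common choices for this kind of binary detection task and can be computed as below; the labels in the example are made up.

```python
def binary_prf(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up gold labels and predictions (1 = sexist, 0 = not sexist).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
precision, recall, f1 = binary_prf(y_true, y_pred)
```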

Section 05

Scientific Value and Practical Significance

Methodological Innovation

The project is presented as the first large-scale application of physiological signals to social media content moderation. It proposes a "people-centered" AI safety research paradigm that could be extended to the identification of other kinds of harmful content.

Theoretical Contribution

Explore the neurophysiological mechanisms of human perception of sexism, population differences, and consistency between subjective reports and objective indicators.

Practical Value

Identify gray-area content
Understand the reasons for user discomfort
Optimize content recommendation algorithms

Section 06

Technical Challenges and Solutions

Data Alignment Difficulty

The modalities are sampled at very different rates (video: 30 fps, heart rate: 1 Hz, EEG: 1000 Hz). Solution: use sliding windows and interpolation to bring all signals onto a common time grid.
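
The alignment idea can be sketched with the sampling rates quoted above, using the video frame times as the common grid. Heart rate is upsampled by linear interpolation, and EEG is downsampled by averaging the samples that fall inside each frame's window. The helper names and the synthetic signals are assumptions for illustration.

```python
import numpy as np


def upsample_to_grid(t_src, values, t_grid):
    """Linearly interpolate a slow signal onto a denser time grid."""
    return np.interp(t_grid, t_src, values)


def window_mean_to_grid(t_src, values, t_grid, window_s):
    """Average a fast signal over [t, t + window_s) for each grid time t."""
    t_src = np.asarray(t_src, dtype=float)
    values = np.asarray(values, dtype=float)
    out = np.full(len(t_grid), np.nan)
    for i, t0 in enumerate(t_grid):
        lo = np.searchsorted(t_src, t0, side="left")
        hi = np.searchsorted(t_src, t0 + window_s, side="left")
        if hi > lo:
            out[i] = values[lo:hi].mean()
    return out


fps, eeg_hz, duration_s = 30, 1000, 2

# Common grid: one timestamp per video frame.
t_video = np.arange(duration_s * fps) / fps

# Synthetic heart rate at 1 Hz and a synthetic 10 Hz EEG-like sine at 1000 Hz.
t_hr = np.array([0.0, 1.0, 2.0])
hr = np.array([70.0, 76.0, 82.0])
t_eeg = np.arange(duration_s * eeg_hz) / eeg_hz
eeg = np.sin(2 * np.pi * 10 * t_eeg)

hr_on_frames = upsample_to_grid(t_hr, hr, t_video)
eeg_on_frames = window_mean_to_grid(t_eeg, eeg, t_video, 1 / fps)
aligned = np.column_stack([hr_on_frames, eeg_on_frames])  # one row per frame
```

Averaging raw EEG over a frame-length window discards most high-frequency content, so a real pipeline would more likely compute per-window features (such as band power) before placing them on the grid.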

Individual Differences

Physiological responses vary by person. Solution: Individual normalization + transfer learning to balance generalization and individual differences.
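
Individual normalization can be illustrated with a per-subject z-score, which expresses each measurement relative to that person's own baseline. This sketch covers only the normalization half of the solution (not transfer learning), and the subject IDs and values are made up.

```python
import numpy as np


def normalize_per_subject(values, subject_ids, eps=1e-8):
    """Z-score each subject's measurements using that subject's own statistics."""
    values = np.asarray(values, dtype=float)
    subject_ids = np.asarray(subject_ids)
    out = np.empty_like(values)
    for sid in np.unique(subject_ids):
        mask = subject_ids == sid
        mu = values[mask].mean(axis=0)
        sigma = values[mask].std(axis=0)
        out[mask] = (values[mask] - mu) / (sigma + eps)
    return out


# Subject "a" has a low resting heart rate, subject "b" a high one.
heart_rate = [60, 62, 64, 90, 95, 100]
subjects = ["a", "a", "a", "b", "b", "b"]
normalized = normalize_per_subject(heart_rate, subjects)
```

After normalization both subjects show the same relative pattern, even though their raw heart rates differ by about 30 bpm, which is what lets a shared model compare responses across people.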

Data Sparsity

Labeled physiological data is scarce. Solution: Semi-supervised learning + data augmentation to make full use of limited data.
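
For the data augmentation half of this solution, a common approach for time series is to create perturbed copies of each signal window through random amplitude scaling and additive noise. The sketch below assumes these two transformations and their parameter ranges; whether such perturbations preserve the label-relevant physiological information would need to be validated on real data.

```python
import numpy as np


def augment_signal(signal, rng, noise_std=0.01, scale_range=(0.9, 1.1)):
    """Return a randomly scaled copy of the signal with Gaussian jitter added."""
    x = np.asarray(signal, dtype=float)
    scale = rng.uniform(*scale_range)
    noise = rng.normal(0.0, noise_std, size=x.shape)
    return x * scale + noise


# Synthetic one-cycle sine standing in for a window of physiological signal.
rng = np.random.default_rng(42)
window = np.sin(np.linspace(0, 2 * np.pi, 100))
augmented = [augment_signal(window, rng) for _ in range(5)]
```

Each augmented copy inherits the label of its source window, which increases the number of training examples without new recording sessions.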

Section 07

Ethical Considerations and Data Privacy

Informed Consent

Subjects must fully understand the experiment's purpose (including possible exposure to uncomfortable content) and participate voluntarily.

Data Privacy

Physiological data (especially EEG) is highly identifiable, requiring strict protective measures.

Research Ethics

Balance research value with participants' psychological impact, and set up psychological support mechanisms.

Application Ethics

Stay alert to misuse of the technology (e.g., emotional manipulation, improper moderation) and define clear boundaries for legitimate use.

Section 08

Future Directions and Summary

Future Development

Expand Modalities: Add galvanic skin response (GSR), facial expression recognition, and speech emotion analysis
Real-Time Detection: Develop an instant moderation system for live content
Cross-Platform/Cultural: Verify the generalization of the method on other platforms and in different cultural contexts

Summary

This research departs from the traditional content moderation paradigm by integrating physiological signals with AI, so that the system can take into account how people actually respond to content. It suggests a new direction for AI safety and platform moderation, and the approach could be extended to fields such as mental health and education.
