Zing Forum

Reading

LENS: An Innovative Framework for Converting Multimodal Physiological Signals into Mental Health Narratives

LENS is a framework that aligns multimodal physiological signals collected by wearable devices with large language models (LLMs), enabling the generation of clinically meaningful mental health narrative reports. Developed by research teams from Dartmouth College, the University of Virginia, and Harvard Medical School, this project has achieved end-to-end conversion from raw time-series signals to natural language descriptions by building a dataset of over 100,000 sensor-text paired entries.

Tags: LENS, mental health, multimodal sensing, large language models, time series, wearable devices, ecological momentary assessment, digital health, clinical narrative generation, sensor-text alignment
Published 2026-04-16 02:38 · Recent activity 2026-04-16 02:48 · Estimated read 8 min

Section 01

LENS Framework: Innovation in Mental Health Narratives Linking Multimodal Physiological Signals and LLMs

LENS (LLM-Enabled Narrative Synthesis) is an innovative framework developed by teams from Dartmouth College, the University of Virginia, and Harvard Medical School. It aligns multimodal physiological signals collected by wearable devices with large language models (LLMs) to generate clinically meaningful mental health narrative reports. The framework addresses the limitations of traditional mental health assessment (reliance on retrospective reports and heavy clinical burden) and the technical gap that existing LLMs cannot directly process time-series data. By building a dataset of over 100,000 sensor-text paired entries, it achieves end-to-end conversion from raw signals to natural language narratives, providing a new path for the digital mental health field.


Section 02

Background: Digital Challenges in Mental Health Assessment

Mental health issues are a key global public health concern: in the U.S., approximately 18% of adults are affected by anxiety and 9.5% experience depression each year. Traditional assessment relies on structured interviews and self-report scales (e.g., PHQ-9, GAD-7), but suffers from heavy clinical burden, reliance on retrospective reports, and difficulty capturing real-world behavioral patterns. The spread of wearable technology offers new possibilities for monitoring by linking behavioral and physiological signals to symptoms, and Ecological Momentary Assessment (EMA) can capture intra-day fluctuations. However, converting massive sensor streams into clinically usable information remains a challenge, and the inability of existing LLMs to directly process time-series data limits their application.


Section 03

Methodology: Construction of High-Quality Sensor-Text Dataset

The research team conducted a 90-day longitudinal study with 258 participants diagnosed with major depressive disorder. Participants wore Garmin vivoactive3 watches and used a mobile app that pushed EMA questionnaires three times daily (13 items adapted from PHQ-9 and GAD-7, scored 0-100), while sensor signals including GPS, steps, accelerometer, call duration, lock-screen events, heart rate, sleep, and stress were recorded. Each EMA completion time was aligned with the preceding 4 hours of sensor data, yielding a dataset of 50,957 samples. Text annotations were generated through template mapping (converting EMA questions and answers into frequency phrases), rewriting with Qwen2.5-14B (to improve fluency and diversity), and multi-agent LLM quality control (to ensure accuracy, completeness, and clinical relevance).
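As a rough illustration of the pairing rule above, here is a minimal Python sketch that selects the sensor records falling in the 4-hour window preceding an EMA response. The record format and function name are hypothetical stand-ins; the paper's actual pipeline is not reproduced here.

```python
from datetime import datetime, timedelta

def window_for_ema(sensor_records, ema_time, hours=4):
    """Return sensor records in the `hours` before an EMA response.
    Hypothetical record format: (timestamp, channel, value) tuples."""
    start = ema_time - timedelta(hours=hours)
    return [r for r in sensor_records if start <= r[0] < ema_time]

# toy data: three records around a noon EMA prompt
records = [
    (datetime(2024, 5, 1, 8, 30), "steps", 412),
    (datetime(2024, 5, 1, 11, 5), "heart_rate", 78),
    (datetime(2024, 5, 1, 13, 40), "stress", 31),
]
ema_time = datetime(2024, 5, 1, 12, 0)
paired = window_for_ema(records, ema_time)
print([r[1] for r in paired])  # ['steps', 'heart_rate'] — 08:00–12:00 window
```

Repeating this for every completed EMA, then attaching the template-mapped answer text, is what produces the sensor-text pairs described above.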


Section 04

Methodology: Alignment Model Architecture with Patch-Level Encoder

The core of LENS is a sensor-text alignment method: it uses a patch-level time-series encoder to split continuous signals into fixed-length patches, then applies a linear transformation to produce vectors with the same dimension as the text word embeddings. Sensor embeddings and question-text embeddings are interleaved and fed into the LLM, which is trained via two-stage curriculum learning (stage one: alignment; stage two: fine-tuning for narrative generation). This design avoids the context limits and precision loss of serializing signals as text, eliminates plotting bias, and lets the LLM natively understand time-series patterns.
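The patchify, project, and interleave steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the patch length, embedding dimension, and random projection weights here are arbitrary stand-ins for learned parameters, and the "text embeddings" are random placeholders for a tokenized question.

```python
import numpy as np

def patchify(signal, patch_len):
    """Split a 1-D signal into fixed-length, non-overlapping patches,
    dropping any trailing remainder shorter than patch_len."""
    n_patches = len(signal) // patch_len
    return np.asarray(signal[: n_patches * patch_len]).reshape(n_patches, patch_len)

def embed_patches(patches, W, b):
    """Linearly project each patch into the LLM's embedding space."""
    return patches @ W + b

def interleave(sensor_embs, text_embs):
    """Place sensor-patch embeddings ahead of the question-text
    embeddings to form one input sequence for the LLM."""
    return np.concatenate([sensor_embs, text_embs], axis=0)

# toy example: 240 heart-rate samples, patches of 16, embedding dim 8
rng = np.random.default_rng(0)
hr = rng.normal(70, 5, size=240)
patch_len, d_model = 16, 8
W = rng.normal(size=(patch_len, d_model))   # stand-in for a learned projection
b = np.zeros(d_model)

patches = patchify(hr, patch_len)           # (15, 16)
sensor_embs = embed_patches(patches, W, b)  # (15, 8)
text_embs = rng.normal(size=(5, d_model))   # stand-in for question tokens
seq = interleave(sensor_embs, text_embs)    # (20, 8)
print(seq.shape)  # (20, 8)
```

In the real system the projection is trained jointly with the LLM during the alignment stage, so that patch vectors land in a region of embedding space the model can read alongside ordinary tokens.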


Section 05

Evidence: Experimental Validation and Clinical Evaluation Results

In quantitative evaluation, LENS outperformed baseline models on NLP metrics (BLEU, ROUGE, BERTScore) and on symptom-severity accuracy. An expert user study (13 mental health professionals evaluating 117 samples) found the generated narratives comprehensive and clinically meaningful, offering useful reference points for decision-making. Experts praised the system's ability to integrate scattered data into coherent descriptions of a person's status that are easier to understand and communicate than scale scores.
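For intuition about the n-gram overlap metrics mentioned above, here is a tiny pure-Python ROUGE-1 F1 sketch. It is a simplification for illustration only; published evaluations use standard BLEU/ROUGE/BERTScore tooling rather than a hand-rolled scorer, and the example sentences are invented.

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """Unigram-overlap ROUGE-1 F1: counts shared words (with multiplicity)
    between candidate and reference, then combines precision and recall."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

ref = "the participant reported feeling anxious most of the day"
cand = "the participant felt anxious for most of the day"
print(round(rouge1_f(cand, ref), 3))  # 0.778
```

Higher overlap with a reference narrative yields a higher score; BERTScore replaces exact word matching with embedding similarity, which is why it is reported alongside the n-gram metrics.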


Section 06

Conclusions and Application Prospects

LENS is an important advance in the digital mental health field, marking the first time deep alignment between multimodal physiological signals and LLMs has been achieved, opening a new path for intelligent interpretation of health sensing data. The methodology can be extended to scenarios such as chronic disease management, elderly health monitoring, and athlete training monitoring. However, attention needs to be paid to data privacy and security (strict protection of sensitive information) and model fairness and generalization (validation on diverse populations). The team plans to make subsequent improvements.


Section 07

Summary and Outlook

Through an innovative data-construction pipeline and model architecture, LENS achieves end-to-end conversion from raw sensor signals to clinical narratives. It contributes over 100,000 high-quality training entries and a scalable technical path, enabling LLMs to reason directly over behavioral signals and assist clinical decision-making. As wearables proliferate and sensors advance, LENS points toward extracting health insights from behavioral data and presenting them in a human-understandable way, suggesting that AI can become a bridge between the quantified self and clinical care.