Zing Forum

Reading

LENS: An Innovative Framework for Converting Multimodal Physiological Signals into Mental Health Narratives

LENS is a framework that aligns multimodal physiological signals collected by wearable devices with large language models (LLMs), enabling the generation of clinically meaningful mental health narrative reports. Developed by research teams from Dartmouth College, the University of Virginia, and Harvard Medical School, this project has achieved end-to-end conversion from raw time-series signals to natural language descriptions by building a dataset of over 100,000 sensor-text paired entries.

Tags: LENS, mental health, multimodal sensing, large language models, time series, wearable devices, ecological momentary assessment, digital health, clinical narrative generation, sensor-text alignment
Published 2026-04-16 02:38 · Recent activity 2026-04-16 02:48 · Estimated read 8 min

Section 01

LENS Framework: Innovation in Mental Health Narratives Linking Multimodal Physiological Signals and LLMs

LENS (LLM-Enabled Narrative Synthesis) is an innovative framework developed by teams from Dartmouth College, the University of Virginia, and Harvard Medical School. It aligns multimodal physiological signals collected by wearable devices with large language models (LLMs) to generate clinically meaningful mental health narrative reports. The framework addresses the limitations of traditional mental health assessment (reliance on retrospective reports and heavy clinical burden) and the technical gap that existing LLMs cannot directly process time-series data. By building a dataset of over 100,000 sensor-text paired entries, it achieves end-to-end conversion from raw signals to natural language narratives, providing a new path for the digital mental health field.


Section 02

Background: Digital Challenges in Mental Health Assessment

Mental health issues are a key global public health concern: in the U.S., approximately 18% of adults are affected by anxiety and 9.5% experience depression each year. Traditional assessment relies on structured interviews and self-report scales (e.g., PHQ-9, GAD-7), but suffers from heavy clinical burden, reliance on retrospective reports, and difficulty capturing real-world behavioral patterns. The spread of wearable technology offers new possibilities for monitoring by linking behavioral and physiological signals to symptoms, and Ecological Momentary Assessment (EMA) can capture intra-day fluctuations. However, converting massive sensor streams into clinically usable information remains a challenge, and the inability of existing LLMs to directly process time-series data limits their application.


Section 03

Methodology: Construction of High-Quality Sensor-Text Dataset

The research team conducted a 90-day longitudinal study with 258 participants diagnosed with major depressive disorder. Participants wore Garmin vivoactive3 watches and used a mobile app that pushed EMA questionnaires three times daily (13 items adapted from PHQ-9 and GAD-7, scored 0-100), while sensor signals including GPS, steps, accelerometer, call duration, lock-screen events, heart rate, sleep, and stress were recorded. Each EMA completion time was aligned with the preceding 4 hours of sensor data, yielding a dataset of 50,957 samples. Text annotations were generated through template mapping (converting EMA questions and answers into frequency phrases), rewriting with Qwen2.5-14B (to improve fluency and diversity), and multi-agent LLM quality control (to ensure accuracy, completeness, and clinical relevance).
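As a rough illustration of the pairing rule above, here is a minimal Python sketch that selects the sensor records falling in the 4-hour window preceding an EMA response. The record format and function name are hypothetical stand-ins; the paper's actual pipeline is not reproduced here.

```python
from datetime import datetime, timedelta

def window_for_ema(sensor_records, ema_time, hours=4):
    """Return sensor records in the `hours` before an EMA response.
    Hypothetical record format: (timestamp, channel, value) tuples."""
    start = ema_time - timedelta(hours=hours)
    return [r for r in sensor_records if start <= r[0] < ema_time]

# toy data: three records around a noon EMA prompt
records = [
    (datetime(2024, 5, 1, 8, 30), "steps", 412),
    (datetime(2024, 5, 1, 11, 5), "heart_rate", 78),
    (datetime(2024, 5, 1, 13, 40), "stress", 31),
]
ema_time = datetime(2024, 5, 1, 12, 0)
paired = window_for_ema(records, ema_time)
print([r[1] for r in paired])  # ['steps', 'heart_rate'] — 08:00–12:00 window
```

Repeating this for every completed EMA, then attaching the template-mapped answer text, is what produces the sensor-text pairs described above.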


Section 04

Methodology: Alignment Model Architecture with Patch-Level Encoder

The core of LENS is a sensor-text alignment method: it uses a patch-level time-series encoder to split continuous signals into fixed-length patches, then applies a linear transformation to produce vectors with the same dimension as the text word embeddings. Sensor embeddings and question-text embeddings are interleaved and fed into the LLM, which is trained via two-stage curriculum learning (stage one: alignment; stage two: fine-tuning for narrative generation). This design avoids the context limits and precision loss of serializing signals as text, eliminates plotting bias, and lets the LLM natively understand time-series patterns.
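The patchify, project, and interleave steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the patch length, embedding dimension, and random projection weights here are arbitrary stand-ins for learned parameters, and the "text embeddings" are random placeholders for a tokenized question.

```python
import numpy as np

def patchify(signal, patch_len):
    """Split a 1-D signal into fixed-length, non-overlapping patches,
    dropping any trailing remainder shorter than patch_len."""
    n_patches = len(signal) // patch_len
    return np.asarray(signal[: n_patches * patch_len]).reshape(n_patches, patch_len)

def embed_patches(patches, W, b):
    """Linearly project each patch into the LLM's embedding space."""
    return patches @ W + b

def interleave(sensor_embs, text_embs):
    """Place sensor-patch embeddings ahead of the question-text
    embeddings to form one input sequence for the LLM."""
    return np.concatenate([sensor_embs, text_embs], axis=0)

# toy example: 240 heart-rate samples, patches of 16, embedding dim 8
rng = np.random.default_rng(0)
hr = rng.normal(70, 5, size=240)
patch_len, d_model = 16, 8
W = rng.normal(size=(patch_len, d_model))   # stand-in for a learned projection
b = np.zeros(d_model)

patches = patchify(hr, patch_len)           # (15, 16)
sensor_embs = embed_patches(patches, W, b)  # (15, 8)
text_embs = rng.normal(size=(5, d_model))   # stand-in for question tokens
seq = interleave(sensor_embs, text_embs)    # (20, 8)
print(seq.shape)  # (20, 8)
```

In the real system the projection is trained jointly with the LLM during the alignment stage, so that patch vectors land in a region of embedding space the model can read alongside ordinary tokens.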


Section 05

Evidence: Experimental Validation and Clinical Evaluation Results

In quantitative evaluation, LENS outperformed baseline models on NLP metrics (BLEU, ROUGE, BERTScore) and on symptom-severity accuracy. An expert user study (13 mental health professionals evaluating 117 samples) found the generated narratives comprehensive and clinically meaningful, offering useful reference points for decision-making. Experts praised the system's ability to integrate scattered data into coherent descriptions of a person's status that are easier to understand and communicate than scale scores.
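For intuition about the n-gram overlap metrics mentioned above, here is a tiny pure-Python ROUGE-1 F1 sketch. It is a simplification for illustration only; published evaluations use standard BLEU/ROUGE/BERTScore tooling rather than a hand-rolled scorer, and the example sentences are invented.

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """Unigram-overlap ROUGE-1 F1: counts shared words (with multiplicity)
    between candidate and reference, then combines precision and recall."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

ref = "the participant reported feeling anxious most of the day"
cand = "the participant felt anxious for most of the day"
print(round(rouge1_f(cand, ref), 3))  # 0.778
```

Higher overlap with a reference narrative yields a higher score; BERTScore replaces exact word matching with embedding similarity, which is why it is reported alongside the n-gram metrics.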


Section 06

Conclusions and Application Prospects

LENS is an important advance in the digital mental health field, marking the first time deep alignment between multimodal physiological signals and LLMs has been achieved, opening a new path for intelligent interpretation of health sensing data. The methodology can be extended to scenarios such as chronic disease management, elderly health monitoring, and athlete training monitoring. However, attention needs to be paid to data privacy and security (strict protection of sensitive information) and model fairness and generalization (validation on diverse populations). The team plans to make subsequent improvements.


Section 07

Summary and Outlook

Through an innovative data-construction pipeline and model architecture, LENS achieves end-to-end conversion from raw sensor signals to clinical narratives. It contributes over 100,000 high-quality training entries and a scalable technical path, enabling LLMs to reason directly over behavioral signals and assist clinical decision-making. As wearables proliferate and sensors advance, LENS points toward extracting health insights from behavioral data and presenting them in a human-understandable way, suggesting that AI can become a bridge between the quantified self and clinical care.