Multimodal Deep Learning Breaks Through T Cell Functional State Prediction: A New Method Integrating Gene Expression and TCR Sequences

This article introduces a multimodal deep learning model that fuses single-cell RNA sequencing and T cell receptor sequencing data. By integrating gene expression profiles, TCR sequence embeddings, and V/J gene usage information via a bidirectional cross-attention mechanism, it achieves high-precision classification of T cell functional states.

Tags: T cells, multimodal, deep learning, single-cell sequencing, TCR, gene expression, immunology, tumor immunity, cross-attention, PyTorch

Published 2026-05-16 07:54 | Recent activity 2026-05-16 08:17 | Estimated read 6 min

Section 01

Introduction

Accurate identification of T cell functional states is crucial in tumor immunotherapy and autoimmune disease research. The open-source multimodal-tcell-classifier project proposes a multimodal deep learning architecture for this task. It integrates gene expression profiles, TCR sequence embeddings, and V/J gene usage information via a bidirectional cross-attention mechanism and classifies seven T cell functional states with high accuracy, giving single-cell multi-omics analysis a practical tool.

Section 02

Research Background and Challenges

T cells are core executors of the adaptive immune system, and their functional states determine the effectiveness of immune responses. However, the same TCR sequence may correspond to different functional states such as effector, memory, or exhausted, so function cannot be determined from the TCR alone. Unimodal methods show this limitation clearly: classification accuracy is only 33.7% using TCR sequences alone and 69.9% using gene expression alone. Capturing the complete biological picture therefore requires fusing both data sources.

Section 03

Model Architecture and Training Strategy

The model uses a bidirectional cross-attention fusion mechanism. Inputs are the expression profiles of 3,000 highly variable genes, TCR-BERT embeddings of CDR3α/β, and one-hot encodings of V/J genes. Gene expression is reduced in dimensionality by two encoder layers, and TCR embeddings are extracted with pre-trained TCR-BERT; the bidirectional cross-attention layer then fuses the modalities. Training data come from 4 public datasets (136,000 cells), and training applies AdamW optimization, a cosine annealing learning rate schedule, and label smoothing. An ensemble of 8 models combined by soft voting achieves an internal test accuracy of 89.6% and a macro F1 score of 0.88.
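The following is a minimal PyTorch sketch of what a bidirectional cross-attention fusion of this kind could look like, followed by one training step with AdamW, cosine annealing, and label smoothing. It is not the project's actual code: the class name `BiCrossAttentionFusion`, the 768-dimensional TCR embedding size, the 64-dimensional V/J encoding, the split of the gene representation into 4 tokens, and all hyperparameter values are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class BiCrossAttentionFusion(nn.Module):
    """Hypothetical two-way cross-attention between gene tokens and TCR tokens."""

    def __init__(self, n_genes=3000, tcr_dim=768, vj_dim=64,
                 d_model=128, n_heads=4, n_gene_tokens=4, n_classes=7):
        super().__init__()
        self.n_gene_tokens = n_gene_tokens
        self.d_model = d_model
        # Two-layer encoder that reduces the gene-expression vector.
        self.gene_encoder = nn.Sequential(
            nn.Linear(n_genes, 512), nn.ReLU(),
            nn.Linear(512, n_gene_tokens * d_model),
        )
        # Projections for precomputed CDR3 embeddings and one-hot V/J usage.
        self.tcr_proj = nn.Linear(tcr_dim, d_model)
        self.vj_proj = nn.Linear(vj_dim, d_model)
        # One attention block per direction.
        self.gene_to_tcr = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.tcr_to_gene = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_g = nn.LayerNorm(d_model)
        self.norm_t = nn.LayerNorm(d_model)
        self.classifier = nn.Linear(2 * d_model, n_classes)

    def forward(self, genes, cdr3a, cdr3b, vj_onehot):
        batch = genes.shape[0]
        # Gene side: (batch, n_gene_tokens, d_model)
        g = self.gene_encoder(genes).view(batch, self.n_gene_tokens, self.d_model)
        # TCR side: three tokens (CDR3 alpha, CDR3 beta, V/J usage)
        t = torch.stack(
            [self.tcr_proj(cdr3a), self.tcr_proj(cdr3b), self.vj_proj(vj_onehot)],
            dim=1,
        )
        # Gene tokens query the TCR tokens, and vice versa.
        g_att, _ = self.gene_to_tcr(query=g, key=t, value=t)
        t_att, _ = self.tcr_to_gene(query=t, key=g, value=g)
        g = self.norm_g(g + g_att)
        t = self.norm_t(t + t_att)
        # Mean-pool each modality and classify the concatenation.
        fused = torch.cat([g.mean(dim=1), t.mean(dim=1)], dim=-1)
        return self.classifier(fused)


torch.manual_seed(0)
model = BiCrossAttentionFusion()

# Random stand-ins for a batch of 8 cells (real inputs would come from scRNA-seq and TCR-BERT).
genes = torch.randn(8, 3000)
cdr3a = torch.randn(8, 768)
cdr3b = torch.randn(8, 768)
vj = torch.zeros(8, 64)
labels = torch.randint(0, 7, (8,))

# Training setup named in the text; the specific values are assumptions.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

logits = model(genes, cdr3a, cdr3b, vj)
loss = criterion(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step()
```

Using several tokens per modality matters in a sketch like this: with a single token on each side, the attention weights would be trivially 1 and the layer would reduce to a linear map.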

Section 04

Functional State Classification and Generalization Performance

The model classifies T cells into 7 states: Treg (markers such as FOXP3, F1=0.94), effector T (markers such as GZMB, F1=0.91), proliferating T (markers such as MKI67, F1=0.90), memory T (markers such as IL7R, F1=0.89), naive T (markers such as CCR7, F1=0.86), exhausted T (markers such as PDCD1, F1=0.83), and Th_effector (F1=0.75). In external validation, accuracy is 86.4% on non-small cell lung cancer datasets, 67.2% on glioblastoma (where exhausted T cells are classified poorly), and 62.6% on skin cancer (where the boundary between naive and memory T cells is blurred).
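For readers unfamiliar with the metrics, the sketch below shows how per-class F1 and macro F1 are conventionally computed from true and predicted labels. The function name `per_class_f1` and the toy labels are illustrative assumptions and are unrelated to the project's evaluation code or data.

```python
import numpy as np


def per_class_f1(y_true, y_pred, n_classes):
    """Return the F1 score of each class as F1 = 2*TP / (2*TP + FP + FN)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        # A class that never appears in labels or predictions scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return np.array(scores)


# Toy example with 3 classes and 6 cells.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
f1 = per_class_f1(y_true, y_pred, n_classes=3)
macro_f1 = f1.mean()  # unweighted mean over classes
```

Because macro F1 weights every class equally, a weak class such as Th_effector lowers it as much as a weak majority class would.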

Section 05

Comparative Analysis and Application Tools

Ablation experiments show the contribution of each component: accuracy is 33.7% with TCR alone and 69.9% with gene expression alone; it rises to 79.3% when TCR embeddings are added and to 88.1% when V/J gene usage and the complete gene expression input are added. Cross-attention contributes a further 0.7 percentage points and ensembling another 0.8, which gives the final 89.6%. Compared with XGBoost, the neural network is slightly behind in internal tests (XGBoost reaches 90.6%) but generalizes better to external cohorts, leading by 8.2% on non-small cell lung cancer. The tool ecosystem covers pip installation, the predict_report.py script, and automatic weight downloading; outputs include predictions.csv, annotated.h5ad, and interactive reports, and a Python API supports integration.
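The ensemble step in the ablation relies on soft voting, which averages the class probabilities of the individual models before taking the most likely class. A minimal NumPy sketch of that idea follows; the function name `soft_vote` and the toy probabilities are assumptions for illustration, not the project's API or outputs.

```python
import numpy as np


def soft_vote(prob_list):
    """Average per-model class probabilities and return (labels, mean_probs)."""
    stacked = np.stack(prob_list, axis=0)  # (n_models, n_cells, n_classes)
    mean_probs = stacked.mean(axis=0)      # (n_cells, n_classes)
    return mean_probs.argmax(axis=1), mean_probs


# Toy example: 3 models, 2 cells, 3 classes.
p1 = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
p2 = np.array([[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]])
p3 = np.array([[0.5, 0.4, 0.1], [0.2, 0.3, 0.5]])

labels, probs = soft_vote([p1, p2, p3])
```

Unlike majority (hard) voting, soft voting keeps each model's confidence, so one confident model can outweigh several uncertain ones.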

Section 06

Limitations and Future Directions

Model limitations: the 7-category framework mixes lineage, function, and cell cycle dimensions; cross-tissue generalization is unstable (for example, exhausted T cells are classified poorly in glioblastoma); and the model produces false positive predictions of the proliferating state. Future directions: develop tissue-specific models and hierarchical classification strategies, and improve normalization and domain adaptation techniques.
