
Neuro-JEPA: A Foundation Model for Sparse Latent Variable Prediction in Multimodal Neuroimaging

The NYU Medical Machine Learning Lab open-sourced Neuro-JEPA, applying the JEPA architecture to neuroimaging analysis and enabling self-supervised learning of multimodal brain images via sparse latent variable prediction.

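Tags: Neuro-JEPA, neuroimaging, self-supervised learning, JEPA, multimodal, sparse representation, medical imaging, brain imaging, deep learning, representation learning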

Published 2026-06-13 05:58 | Recent activity 2026-06-13 06:21 | Estimated read 8 min

Section 01

[Introduction] Neuro-JEPA: Open-Source Foundation Model for Sparse Latent Variable Prediction in Multimodal Neuroimaging

The NYU Medical Machine Learning Lab (NYUMedML) open-sourced Neuro-JEPA on GitHub on June 12, 2026. The model applies JEPA (Joint Embedding Predictive Architecture) to neuroimaging analysis and learns self-supervised representations of multimodal brain images through sparse latent variable prediction. It is tailored to the characteristics of neuroimaging data and supports modalities such as MRI, fMRI, and PET. It targets two persistent problems, scarce labeled data and difficult cross-modal alignment, and aims to provide high-quality representations for downstream tasks such as brain region segmentation and disease classification.

Section 02

[Background] Challenges in Neuroimaging Analysis and Introduction of the JEPA Architecture

Neuroimaging analysis faces three recurring challenges: scarce labeled data, difficult alignment between modalities, and high-dimensional inputs. Traditional supervised learning depends on large amounts of manual annotation, which is costly and requires specialist physicians. Self-supervised learning (SSL) instead learns representations from unlabeled data through pretext tasks, which offers a way around the annotation bottleneck. The JEPA family (e.g., I-JEPA, V-JEPA) has been successful in computer vision. Its core idea is to predict representations in latent space rather than pixels, which avoids the drawbacks of pixel-level reconstruction (modeling redundant detail, high computational overhead) and is better suited to learning semantic features. Neuro-JEPA brings this idea to neuroimaging and adapts it to the domain.
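For reference, the latent-prediction idea can be written down using the I-JEPA formulation. This is a sketch in our own notation: f_θ is the context encoder, f_θ̄ the target encoder (an exponential moving average of the context encoder), g_φ the predictor, T the set of target patches, m_i the positional mask token for target patch i, and sg a stop-gradient. The source does not state whether Neuro-JEPA uses exactly this form.

```latex
\mathcal{L}_{\text{JEPA}}
  = \frac{1}{|T|} \sum_{i \in T}
    \left\| g_\phi\big(f_\theta(x_{\text{ctx}}),\, m_i\big)
          - \operatorname{sg}\big(f_{\bar{\theta}}(x)_i\big) \right\|_2^2,
\qquad
\bar{\theta} \leftarrow \tau\,\bar{\theta} + (1-\tau)\,\theta
```

A pixel-reconstruction objective such as the one used by MAE would compare a decoder output against the raw patch x_i instead of against the target encoder's representation of that patch.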

Section 03

[Methodology] Architectural Design of Neuro-JEPA and Sparse Latent Variable Prediction Mechanism

Neuro-JEPA adapts JEPA to neuroimaging in four ways:
1. 3D Volume Processing: Uses 3D patch division and attention mechanisms to capture anatomical correlations across slices.
2. Multimodal Fusion: Learns features shared across modalities through a modality-agnostic representation space.
3. Sparse Latent Variable Prediction: The core innovation. The model predicts sparse latent variable activations, which is intended to improve interpretability, efficiency, and generalization. Sparsity is achieved via L1 regularization or gating mechanisms.
4. Anatomical Structure Awareness: Introduces anatomical priors (e.g., brain region segmentation maps) so that the learned representations are anatomically meaningful.
The sparsity-constrained objective is: L = ||z_t - Decoder(h)||² + λ||h||₁, where λ controls the degree of sparsity.
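The objective above can be illustrated with a small, dependency-free sketch. The source does not define the symbols, so this assumes z_t is the target latent vector and h is the sparse latent code. The linear decoder, the function names, and the soft-thresholding step (a standard way to obtain exact zeros under an L1 penalty) are illustrative assumptions, not the project's implementation.

```python
# Sketch of L = ||z_t - Decoder(h)||^2 + lambda * ||h||_1.
# All names and the linear decoder are hypothetical, chosen for illustration.

def decode(h, weights):
    """Hypothetical linear decoder: maps a latent code h (length k) to a
    vector of length d, where weights has d rows of k entries each."""
    return [sum(w * h_j for w, h_j in zip(row, h)) for row in weights]


def sparse_prediction_loss(z_t, h, weights, lam):
    """Squared reconstruction error plus an L1 penalty on the latent code."""
    z_hat = decode(h, weights)
    reconstruction = sum((a - b) ** 2 for a, b in zip(z_t, z_hat))
    l1_penalty = sum(abs(v) for v in h)
    return reconstruction + lam * l1_penalty


def soft_threshold(h, tau):
    """Proximal operator of tau * ||h||_1: shrinks every entry toward zero
    by tau and sets entries with magnitude below tau to exactly zero."""
    return [max(abs(v) - tau, 0.0) * (1.0 if v > 0 else -1.0) for v in h]


if __name__ == "__main__":
    weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # d = 2, k = 3
    h = [1.0, -2.0, 0.05]
    z_t = [1.0, 0.0]
    print(sparse_prediction_loss(z_t, h, weights, lam=0.1))
    print(soft_threshold(h, tau=0.1))  # the small third entry is zeroed
```

Raising lam pushes more entries of h toward zero at the cost of a larger reconstruction error, which is the trade-off that λ controls in the formula.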

Section 04

[Evidence] Experimental Validation: Performance of Neuro-JEPA in Downstream Tasks

Neuro-JEPA was pre-trained on large-scale datasets such as ADNI, UK Biobank, and ABCD. Reported downstream results:
1. Brain Region Segmentation: On FreeSurfer segmentation tasks, the Dice coefficient improved by 3-5% over MAE and by more than 15% over random initialization.
2. Disease Diagnosis: AUC for Alzheimer's disease classification on ADNI reached 0.92, outperforming existing self-supervised methods.
3. Cross-Modal Transfer: Models pre-trained on structural MRI still performed well when transferred to fMRI tasks.
Ablation experiments confirmed the contribution of three components: the sparsity constraint (performance dropped by 5% when it was removed), 3D processing (better than 2D slices), and multimodal pre-training (better than unimodal pre-training).
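For readers unfamiliar with the two metrics quoted above, the following sketch shows how they are commonly computed. It is a plain-Python illustration with hypothetical function names, not the project's evaluation code, and it works on flat lists for simplicity.

```python
# Illustrative implementations of the Dice coefficient and ROC AUC.

def dice_coefficient(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks given as
    flat sequences of 0/1 values. Two empty masks count as a perfect match."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * intersection / total


def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive example gets
    a higher score than a randomly chosen negative one (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise AUC above is quadratic in the number of examples, which is fine for illustration. Rank-based implementations are preferable for large test sets.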

Section 05

[Applications & Open Source] Downstream Task Applications and Open-Source Resources of Neuro-JEPA

Downstream applications include brain region segmentation, disease classification, image registration, and generative tasks (e.g., completing a missing modality). The open-source codebase is modular and covers data processing (NIfTI and CIFTI formats), the model implementation (a PyTorch-based 3D ViT plus a sparse predictor), pre-training scripts, and downstream task examples. Pre-trained weights for ADNI and UK Biobank are provided. The typical workflow is: prepare the environment → process the data → pre-train or load the released weights → fine-tune.
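To give a sense of what feeding a volume to a 3D ViT involves, the sketch below splits a volume into non-overlapping cubic patches and flattens one patch into a token vector. The patch size, the non-overlapping layout, the nested-list volume, and the function names are all assumptions made for illustration. The actual codebase may tokenize volumes differently.

```python
# Dependency-free sketch of 3D patch tokenization (hypothetical helpers).
from itertools import product


def patch_origins(volume_shape, patch_size):
    """Return the (z, y, x) corner of every non-overlapping cubic patch.
    Assumes every dimension is divisible by patch_size."""
    if any(dim % patch_size for dim in volume_shape):
        raise ValueError("volume dimensions must be divisible by patch_size")
    ranges = [range(0, dim, patch_size) for dim in volume_shape]
    return list(product(*ranges))


def extract_patch(volume, origin, patch_size):
    """Flatten one patch of a nested-list volume, indexed as
    volume[z][y][x], into a single token vector."""
    z0, y0, x0 = origin
    return [
        volume[z][y][x]
        for z in range(z0, z0 + patch_size)
        for y in range(y0, y0 + patch_size)
        for x in range(x0, x0 + patch_size)
    ]


if __name__ == "__main__":
    shape = (4, 4, 4)
    # Toy volume whose voxel value encodes its own position.
    volume = [[[z * 16 + y * 4 + x for x in range(4)] for y in range(4)]
              for z in range(4)]
    origins = patch_origins(shape, patch_size=2)
    print(len(origins))  # number of tokens for this volume
    print(extract_patch(volume, origins[-1], patch_size=2))
```

Because each patch spans several slices, attention over these tokens can relate structures across slice boundaries, which a 2D slice-by-slice model cannot do directly.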

Section 06

[Contributions & Outlook] Innovations of Neuro-JEPA and Future Research Directions

Innovations:
1. First systematic application of JEPA to neuroimaging.
2. A sparse latent variable prediction mechanism.
3. A unified multimodal representation.
Limitations: High computational resource requirements, insufficient handling of data heterogeneity, and limited coverage of downstream tasks.
Future directions: Cross-dataset pre-training, temporal modeling (capturing disease progression), clinical data fusion (imaging combined with genomics and cognitive tests), and tools for better interpretability.

Section 07

[Summary] Significance of Neuro-JEPA for Self-Supervised Learning in Neuroimaging

By combining the JEPA architecture with sparse latent variable prediction, Neuro-JEPA learns high-quality representations of multimodal brain images and gives neuroimaging analysis a practical tool. The release of the code and pre-trained models is expected to encourage follow-up research and to accelerate the clinical adoption of neuroimaging AI.
