Zing Forum

DISCO: A Multimodal Diffusion Model-Driven Protein Co-Design System

A protein design tool based on multimodal diffusion models that enables the co-generation of sequences and 3D structures, providing a new paradigm for drug discovery and synthetic biology.

Protein design · Diffusion models · Multimodal · Bioinformatics · Drug discovery · Synthetic biology · Structure prediction · AI4Science
Published 2026-05-02 12:37 | Recent activity 2026-05-02 12:49 | Estimated read 6 min

Section 01

Introduction to DISCO: A Multimodal Diffusion Model-Driven Protein Co-Design System

DISCO is a protein design tool built on multimodal diffusion models. It generates amino acid sequences and 3D structures together rather than one after the other, which addresses the separation between sequence and structure in traditional protein design. Its two core ideas are the co-design paradigm and a multimodal conditioning mechanism, and the tool is positioned as a new approach for drug discovery and synthetic biology.

Section 02

Background of Paradigm Shift in Protein Design

A protein's function depends on its 3D structure, which is in turn determined by its amino acid sequence. Traditional protein design relies on physical simulation and heuristic search, which are computationally expensive and have limited success rates. Diffusion models have recently achieved breakthroughs in image generation, and researchers have begun applying them to molecular design; DISCO is a recent result in this direction.

Section 03

DISCO's Co-Design Paradigm and Multimodal Conditional Mechanism

Traditional methods split protein design into two independent stages, sequence design and structure prediction, which can leave the designed sequence and the predicted structure inconsistent with each other. DISCO instead proposes a co-design paradigm that generates the sequence and the 3D structure simultaneously, so that each constrains and refines the other. At the sequence level it accounts for biochemical properties, evolutionary conservation, and similar features; at the structure level it accounts for secondary structure arrangement, active site constraints, and similar features. The core innovation is a multimodal conditional diffusion architecture that accepts three kinds of input to guide generation: biomolecular conditions (functional domains, binding sites, etc.), ligand conditions (small-molecule structures), and text descriptions (e.g., "thermostable esterase").
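To make the idea of mutual constraint concrete, here is a minimal toy sketch in plain Python. It is not DISCO's model or API: the `Conditions` and `JointState` classes, the six-letter alphabet, and the hand-written "buried positions favour hydrophobic residues" rule are all assumptions invented for illustration. The only point it shows is that each step updates sequence scores using the current structure and updates coordinates using the current sequence, while a condition (here, a fixed residue at a chosen position) is re-imposed at every step.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]
ALPHABET = "AVLKDE"        # tiny illustrative alphabet, not the full 20 residues
HYDROPHOBIC = set("AVL")


@dataclass
class Conditions:
    """Hypothetical container for the three condition types named in the text."""
    fixed_residues: Dict[int, str] = field(default_factory=dict)  # biomolecular, e.g. active site
    ligand_smiles: Optional[str] = None   # ligand condition; carried but ignored by this toy
    text_prompt: Optional[str] = None     # text condition; carried but ignored by this toy


@dataclass
class JointState:
    logits: List[List[float]]  # continuous per-residue scores over ALPHABET
    coords: List[Vec3]         # one 3D point per residue


def argmax(row: List[float]) -> int:
    return max(range(len(row)), key=row.__getitem__)


def joint_step(state: JointState, cond: Conditions, rate: float = 0.2) -> JointState:
    """One toy update in which each modality reads the other."""
    n = len(state.coords)
    cen = tuple(sum(c[k] for c in state.coords) / n for k in range(3))
    dists = [math.dist(c, cen) for c in state.coords]
    mean_d = sum(dists) / n
    logits, coords = [], []
    for i, (row, xyz) in enumerate(zip(state.logits, state.coords)):
        buried = dists[i] < mean_d
        # Structure -> sequence: buried positions favour hydrophobic residues.
        row = [s + (rate if (aa in HYDROPHOBIC) == buried else -rate)
               for s, aa in zip(row, ALPHABET)]
        # Condition: clamp positions whose residue identity is given.
        if i in cond.fixed_residues:
            row = [10.0 if aa == cond.fixed_residues[i] else -10.0 for aa in ALPHABET]
        # Sequence -> structure: hydrophobic-leaning positions drift inward, others outward.
        pull = rate if ALPHABET[argmax(row)] in HYDROPHOBIC else -rate
        coords.append(tuple(x + pull * (c - x) for x, c in zip(xyz, cen)))
        logits.append(row)
    return JointState(logits, coords)


def co_design(length: int, steps: int, cond: Conditions, seed: int = 0):
    """Start from random noise in both modalities and refine them together."""
    rng = random.Random(seed)
    state = JointState(
        logits=[[rng.gauss(0, 1) for _ in ALPHABET] for _ in range(length)],
        coords=[tuple(rng.gauss(0, 5) for _ in range(3)) for _ in range(length)],
    )
    for _ in range(steps):
        state = joint_step(state, cond)
    # Project the continuous scores to a discrete sequence only at the end.
    sequence = "".join(ALPHABET[argmax(row)] for row in state.logits)
    return sequence, state.coords


if __name__ == "__main__":
    cond = Conditions(fixed_residues={3: "K"}, text_prompt="thermostable esterase")
    seq, xyz = co_design(length=12, steps=10, cond=cond)
    print(seq, len(xyz))
```

A real system would replace the hand-written rules with a learned denoiser that consumes all condition types; the sketch only mirrors the control flow of updating both modalities in the same loop.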

Section 04

Adaptation Scheme of Diffusion Models in Molecular Space

DISCO adapts the diffusion framework to the specific challenges of molecular design:

1. SE(3) equivariance: Equivariant neural networks ensure that rotating or translating the input rotates or translates the output in the same way, so generated structures do not depend on the choice of coordinate system. This helps avoid unreasonable conformations.
2. Discrete-continuous hybrid space: Sequences are discrete and coordinates are continuous. DISCO handles both in a unified continuous diffusion framework and projects the sequence probabilities back into discrete space during decoding.
3. Multiscale representation: The model learns at levels from atoms up to residues, capturing both hydrogen bond networks and the overall topological fold.
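The SE(3) equivariance property can be checked numerically on a small example. The sketch below is an assumption-laden illustration, not DISCO's network: `equivariant_update` is a hand-written coordinate update whose weights depend only on pairwise distances and whose directions are coordinate differences, in the style of distance-based equivariant graph layers. The check confirms that applying a rigid motion before the update gives the same result as applying it after.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def rotate_z(p: Vec3, angle: float) -> Vec3:
    """Rotate a point about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1], p[2])


def transform(points: List[Vec3], angle: float, shift: Vec3) -> List[Vec3]:
    """Apply a rigid motion: rotate about z, then translate."""
    return [tuple(a + b for a, b in zip(rotate_z(p, angle), shift)) for p in points]


def equivariant_update(points: List[Vec3], step: float = 0.1) -> List[Vec3]:
    """Move each point along distance-weighted directions to the other points.

    The weights use only pairwise distances (unchanged by rigid motions) and
    the directions are coordinate differences (they rotate with the input and
    ignore translation), so the update commutes with rigid motions.
    """
    out = []
    for i, p in enumerate(points):
        delta = [0.0, 0.0, 0.0]
        for j, q in enumerate(points):
            if i == j:
                continue
            w = math.exp(-math.dist(p, q))
            for k in range(3):
                delta[k] += w * (q[k] - p[k])
        out.append(tuple(p[k] + step * delta[k] for k in range(3)))
    return out


def is_equivariant(points: List[Vec3], angle: float, shift: Vec3) -> bool:
    """Compare update(transform(x)) with transform(update(x))."""
    a = equivariant_update(transform(points, angle, shift))
    b = transform(equivariant_update(points), angle, shift)
    return all(math.isclose(x, y, abs_tol=1e-9)
               for p, q in zip(a, b) for x, y in zip(p, q))


if __name__ == "__main__":
    pts = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 1.0), (1.0, 1.0, 3.0)]
    print(is_equivariant(pts, angle=0.7, shift=(1.0, -2.0, 0.5)))
```

The check only exercises rotations about one axis plus a translation, which is enough to show the idea. An update that added a fixed vector such as `(1, 0, 0)` to every point would fail it under rotation, because that direction is tied to a particular coordinate frame.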

Section 05

Application Scenarios and Experimental Validation Results of DISCO

In benchmark tests, DISCO is reported to perform well across three tasks. In enzyme design, the designed enzymes reach activity close to that of natural enzymes. In binding protein design, the success rate for a given ligand is more than three times that of traditional methods. In structure completion, it fills in the missing regions of partially known structures, which is important for protein engineering and repair research.

Section 06

Impact of DISCO on Drug Discovery and Synthetic Biology

In drug discovery, DISCO enables "on-demand design": binding proteins or antibodies can be designed de novo against a disease target, which could shorten response times to emerging infectious diseases. In synthetic biology, it supplies component-level design for artificial metabolic pathways: specific enzymes can be designed and assembled into efficient biosynthetic pathways, for example to convert cheap substrates into high-value chemicals or to build microbial systems for plastic degradation.

Section 07

Technical Limitations and Future Development Directions

DISCO has limitations. It mainly optimizes static structures, so its modeling of dynamic conformations and allosteric regulation still needs improvement, and experimental validation of designs remains a bottleneck. Future directions include integrating molecular dynamics simulations to refine generated structures, introducing active learning driven by experimental feedback, and extending the method to protein-protein interaction systems. DISCO is also open source, which lowers the barrier to entry for researchers and encourages open collaboration.