Zing Forum

DISCO: A Multimodal Protein Co-Design Model Enabling DNA-Encoded Chemistry

DISCO is a multimodal generative model that co-designs protein sequences and 3D structures simultaneously. It supports conditional generation on biomolecules such as small molecules, DNA, and RNA, and is reported to perform strongly in enzyme design and drug development.

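Tags: protein design, diffusion models, multimodal generation, enzyme engineering, structural biology, drug development, bioinformatics, deep learning
Published 2026-05-14 05:40 | Recent activity 2026-05-14 05:48 | Estimated read 5 min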

Section 01

DISCO: Core Introduction to the Multimodal Protein Co-Design Model

DISCO is a multimodal generative model described as the first to co-design protein sequences and 3D structures simultaneously. It supports conditional generation on biomolecules such as small molecules, DNA, and RNA, is reported to perform strongly in enzyme design and drug development, and removes the constraints of the traditional step-by-step design strategy.

Section 02

Limitations of Traditional Protein Design and the Background of DISCO's Breakthrough

Traditional protein design follows a step-by-step strategy: first generate the backbone structure, then predict an amino acid sequence for it. Because the two steps are decoupled, it is difficult to ensure that sequence and structure are optimally matched. DISCO removes this limitation by designing sequence and structure together, opening new possibilities for biomolecular engineering.

Section 03

Technical Architecture and Implementation Details of DISCO

DISCO is built on a diffusion model architecture and uses the Hydra configuration system to manage parameters. It provides two experimental presets: designable (entropy-adaptive temperature scaling plus dual-modal noise guidance; high designability but low diversity) and diverse (free sampling; high diversity but low designability). It also offers two compute modes: fast (100 diffusion steps and 2 cycles, about 4x faster) and max (200 diffusion steps and 4 cycles, full quality). Memory optimization uses DeepSpeed4Science EvoformerAttention to support long sequences. An Ampere or newer GPU is required, and AMD users need to adjust dependencies.
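To make the preset and compute-mode combinations concrete, here is a minimal Python sketch. It is not DISCO's actual Hydra configuration: the class names, field names, and the `build_run_config` and `estimated_speedup` helpers are hypothetical, and only the step and cycle counts come from the description above. The speedup estimate assumes cost scales with diffusion steps times cycles, which happens to match the quoted 4x figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplingPreset:
    # Hypothetical flags mirroring the two presets described in the text.
    entropy_adaptive_temperature: bool
    dual_modal_noise_guidance: bool


@dataclass(frozen=True)
class ComputeMode:
    diffusion_steps: int
    cycles: int


PRESETS = {
    "designable": SamplingPreset(True, True),   # high designability, low diversity
    "diverse": SamplingPreset(False, False),    # free sampling, high diversity
}

COMPUTE_MODES = {
    "fast": ComputeMode(diffusion_steps=100, cycles=2),
    "max": ComputeMode(diffusion_steps=200, cycles=4),
}


def build_run_config(preset: str, mode: str) -> dict:
    """Merge one preset and one compute mode into a flat config dict."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    if mode not in COMPUTE_MODES:
        raise ValueError(f"unknown compute mode: {mode!r}")
    p, m = PRESETS[preset], COMPUTE_MODES[mode]
    return {
        "preset": preset,
        "mode": mode,
        "entropy_adaptive_temperature": p.entropy_adaptive_temperature,
        "dual_modal_noise_guidance": p.dual_modal_noise_guidance,
        "diffusion_steps": m.diffusion_steps,
        "cycles": m.cycles,
    }


def estimated_speedup(mode: str, baseline: str = "max") -> float:
    """Rough speedup, ASSUMING cost is proportional to steps * cycles."""
    m, b = COMPUTE_MODES[mode], COMPUTE_MODES[baseline]
    return (b.diffusion_steps * b.cycles) / (m.diffusion_steps * m.cycles)
```

For example, `build_run_config("designable", "fast")` pairs the designability-oriented sampling flags with 100 diffusion steps and 2 cycles, and `estimated_speedup("fast")` compares that workload against the max mode.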

Section 04

Experimental Evidence and Performance of DISCO

On the Studio-179 benchmark (179 ligands), DISCO achieved the best performance on 178 of 179 metrics. For catalytic enzyme design, supplying reaction intermediates as input produced heme enzymes with novel active sites that catalyze non-natural carbene transfer reactions such as olefin cyclopropanation. The top designs showed higher activity than engineered natural enzymes, and activity increased 4-fold after mutagenesis. Co-design ability is judged by refolding each design: the RMSD of the backbone and ligand centroid after refolding must be below 2 Å.
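The refolding criterion can be illustrated with a short Python sketch. This is one possible reading, not the authors' evaluation code: it assumes the designed and refolded structures are already superimposed in the same frame, and it treats the check as two conditions (backbone RMSD below 2 Å and ligand-centroid displacement below 2 Å). The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root-mean-square deviation between paired coordinates (no superposition)."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))


def centroid(coords: Sequence[Coord]) -> Coord:
    """Arithmetic mean position of a set of atoms."""
    if not coords:
        raise ValueError("need at least one coordinate")
    n = len(coords)
    return (
        sum(c[0] for c in coords) / n,
        sum(c[1] for c in coords) / n,
        sum(c[2] for c in coords) / n,
    )


def passes_refolding_check(
    designed_backbone: Sequence[Coord],
    refolded_backbone: Sequence[Coord],
    designed_ligand: Sequence[Coord],
    refolded_ligand: Sequence[Coord],
    threshold: float = 2.0,  # angstroms, from the criterion quoted in the text
) -> bool:
    """ASSUMED reading: both backbone RMSD and ligand-centroid shift are < threshold."""
    backbone_rmsd = rmsd(designed_backbone, refolded_backbone)
    ligand_shift = math.dist(centroid(designed_ligand), centroid(refolded_ligand))
    return backbone_rmsd < threshold and ligand_shift < threshold
```

In practice the two structures would first be aligned (for example with the Kabsch algorithm) before computing the RMSD; that step is omitted here to keep the sketch dependency-free.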

Section 05

Application Scenarios and Experimental Reproducibility Support of DISCO

DISCO supports unconditional generation (70-300 residues), ligand-conditional generation (for example, heme B or warfarin), and nucleic acid-conditional generation (DNA/RNA complexes). All experimental samples and results are available on Hugging Face for reproduction and verification.

Section 06

Scientific Significance and Future Outlook of DISCO

DISCO marks a shift in protein design from "structure-first" to "sequence-structure co-design". It improves design success rates and expands the reachable functional space, including novel catalytic activities, with applications in drug development (targeted ligand design) and synthetic biology (new biocatalytic systems). Further model optimization and progress on community benchmarks are expected to drive the next round of advances.