Zing Forum

DISCO: Multimodal Protein Design Enables DNA-Encoded Chemistry

DISCO is a multimodal diffusion model that co-designs protein sequences and 3D structures. It creates entirely new enzymes, without pre-specified catalytic residues, that efficiently catalyze chemical reactions not found in nature.

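Tags: Protein design, Enzyme engineering, Diffusion models, Multimodal learning, Synthetic biology, Carbene transfer, Directed evolution
Published 2026-04-07 05:21 | Recent activity 2026-04-08 11:22 | Estimated read 5 min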

Section 01

[Introduction] DISCO: Multimodal Protein Design Breaks Through the Bottleneck of De Novo Enzyme Design

DISCO is a multimodal diffusion model that co-designs protein sequences and 3D structures. It creates entirely new enzymes without pre-specified catalytic residues. These enzymes efficiently catalyze reactions not found in nature (e.g., carbene transfer reactions), with activity that exceeds even that of engineered enzymes. The study moves past a key limitation of traditional protein design, offers a new approach to the scalable design of evolvable enzymes, and greatly expands the range of chemical transformations that can be genetically encoded.


Section 02

[Background] Challenges in Protein Design and Limitations of Existing Methods

Enzymes are molecular machines that efficiently catalyze chemical reactions in living organisms, but natural evolution has explored only a limited part of chemical space. The ultimate goal of protein design is to create enzymes de novo. Existing deep generative models, however, require catalytic residues to be specified in advance. This limits design freedom, demands in-depth knowledge of the reaction mechanism from researchers, and prevents the model from discovering suitable catalytic residue configurations on its own.


Section 03

[Methodology] DISCO's Multimodal Sequence-Structure Collaborative Design Framework

The core innovation of DISCO is its multimodal design, which handles protein sequences (1D amino acid strings) and structures (3D atomic coordinates) at the same time. Built on a diffusion model architecture, it generates plausible sequence-structure pairs through step-by-step denoising. A key component is an inference-time scaling method that steers both modalities toward specific objectives (e.g., active-site geometry) and addresses the computational challenges that arise from the difference in dimensionality between sequence space and structure space.
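To make the idea of inference-time scaling concrete, here is a minimal sketch using best-of-N sampling, the simplest way to spend more compute at inference time. It is not DISCO's actual method or code: the toy denoiser, the `toy_score` objective, and all names and constants below are hypothetical stand-ins chosen for illustration. The sketch only shows the general pattern of jointly denoising a discrete sequence and continuous coordinates, then keeping the candidate that scores best on an objective that reads both modalities.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "_"  # placeholder for a residue that has not been decoded yet


def toy_joint_sample(length, steps, rng):
    """Toy stand-in for a joint sequence-structure denoiser (NOT DISCO's network)."""
    seq = [MASK] * length
    # Start the structure from pure noise.
    coords = [[rng.gauss(0, 10) for _ in range(3)] for _ in range(length)]
    # Hypothetical "clean" structure: a straight chain with 3.8 units between residues.
    target = [[3.8 * i, 0.0, 0.0] for i in range(length)]
    order = list(range(length))
    rng.shuffle(order)  # random order in which positions get unmasked
    for t in range(steps):
        noise = 1.0 - (t + 1) / steps  # noise level shrinks to zero at the last step
        # Continuous modality: move halfway toward the target, then add noise.
        for i in range(length):
            for d in range(3):
                coords[i][d] += 0.5 * (target[i][d] - coords[i][d]) + rng.gauss(0, noise)
        # Discrete modality: unmask a growing share of positions.
        upto = math.ceil(length * (t + 1) / steps)
        for i in order[:upto]:
            if seq[i] == MASK:
                seq[i] = rng.choice(AMINO_ACIDS)
    return "".join(seq), coords


def toy_score(seq, coords):
    """Hypothetical cross-modal objective: reward histidines, penalise bad spacing."""
    spacing_error = sum(
        abs(math.dist(coords[i], coords[i + 1]) - 3.8) for i in range(len(coords) - 1)
    )
    return seq.count("H") - 0.1 * spacing_error


def best_of_n(n, length=30, steps=10, seed=0):
    """Draw n candidates and keep the best one; larger n means more inference compute."""
    rng = random.Random(seed)
    candidates = [toy_joint_sample(length, steps, rng) for _ in range(n)]
    return max(candidates, key=lambda c: toy_score(*c))


if __name__ == "__main__":
    for n in (1, 4, 16):
        seq, coords = best_of_n(n)
        print(n, seq, round(toy_score(seq, coords), 2))
```

Because the same seed is reused, the candidates drawn for a small `n` are a prefix of those drawn for a larger `n`, so the best score can only stay the same or improve as `n` grows. A real system would replace the toy sampler with a trained model and might guide the denoising trajectory itself rather than only selecting among finished samples.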


Section 04

[Evidence] Functional and Evolvability Validation of DISCO-Designed Enzymes

DISCO can design heme enzymes from reaction intermediates alone. The resulting enzymes catalyze carbene transfer reactions not found in nature (e.g., olefin cyclopropanation and spirocyclopropanation), with activity exceeding that of enzymes optimized through directed evolution. Random mutagenesis experiments confirm that the designed enzymes are evolvable: their activity improves further after directed evolution, which supports the practical value of the designs.
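The mutate, screen, and select cycle behind directed evolution can be sketched in a few lines. This is a toy simulation under stated assumptions, not the experimental protocol from the study: `toy_activity` is a hypothetical fitness function that counts matches to a hidden optimum sequence, whereas real activity would be measured in the lab.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_activity(seq, optimum):
    """Hypothetical fitness: fraction of positions that match a hidden optimum."""
    return sum(a == b for a, b in zip(seq, optimum)) / len(optimum)


def mutate(seq, rng):
    """Introduce one random point mutation."""
    pos = rng.randrange(len(seq))
    return seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]


def directed_evolution(parent, optimum, rounds=5, library_size=50, seed=0):
    """Each round: build a mutant library, screen it, keep the best if it improves."""
    rng = random.Random(seed)
    history = [toy_activity(parent, optimum)]
    for _ in range(rounds):
        library = [mutate(parent, rng) for _ in range(library_size)]
        best = max(library, key=lambda s: toy_activity(s, optimum))
        if toy_activity(best, optimum) > toy_activity(parent, optimum):
            parent = best  # the improved variant seeds the next round
        history.append(toy_activity(parent, optimum))
    return parent, history


if __name__ == "__main__":
    rng = random.Random(1)
    optimum = "".join(rng.choice(AMINO_ACIDS) for _ in range(20))
    start = "".join(rng.choice(AMINO_ACIDS) for _ in range(20))
    final, history = directed_evolution(start, optimum)
    print(history)
```

In this toy setting a starting sequence is "evolvable" if single mutations can raise its score, which is the property the random mutagenesis experiments probe for the designed enzymes.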


Section 05

[Conclusion] Significance and Application Prospects of DISCO

DISCO represents a milestone in AI-driven protein design and expands the chemical space that can be genetically encoded. Potential applications include drug manufacturing (synthesis of complex intermediates), biofuel production (catalysis of hard-to-convert steps), and materials science (green polymer synthesis), opening up new possibilities for fields such as synthetic biology.


Section 06

[Open Source & Future] DISCO's Open Source and Future Challenges in Protein Design

The DISCO code has been open-sourced (https://github.com/DISCO-design/DISCO) to promote reproducibility and community development. Remaining challenges include the bottleneck of experimental validation, improving design success rates, and handling more complex reactions. Future research directions include expanding the range of enzyme types, integrating active learning with experimental feedback, and combining the approach with synthetic biology tools.