Section 01
ESM-2 Enzyme Family Classification System: A Guide to Production-Grade Fine-Tuning
This project introduces a production-grade fine-tuning system for enzyme family classification built on the ESM-2 protein language model. It targets a known weakness of traditional sequence alignment methods: classifying distantly homologous proteins, whose sequences have diverged too far for alignment to detect the relationship reliably. The system combines LoRA parameter-efficient fine-tuning, homology-aware data splitting, temperature-scaling calibration, and Integrated Gradients interpretability into a complete workflow that runs from data processing and model training through production deployment. It is intended to support reliable enzyme function annotation, with applications in fields such as drug discovery and synthetic biology.
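Of the components listed above, homology-aware data splitting is the easiest to illustrate in isolation. The sketch below is a minimal illustration, not the project's actual implementation: it assumes each sequence has already been assigned to a cluster of similar sequences by some external clustering step, and it assigns whole clusters to either the training or the test set so that close homologs never sit on both sides. The function name `homology_aware_split`, the `cluster_of` mapping, and the default `test_fraction` are all hypothetical.

```python
import random
from collections import defaultdict


def homology_aware_split(cluster_of, test_fraction=0.2, seed=0):
    """Split sequence IDs into train/test so that no cluster spans both sets.

    cluster_of: dict mapping sequence ID -> cluster ID. This input is an
    assumption of the sketch; it would come from clustering the sequences
    at some sequence-identity threshold beforehand.
    """
    # Group sequence IDs by their cluster.
    clusters = defaultdict(list)
    for seq_id, cluster_id in cluster_of.items():
        clusters[cluster_id].append(seq_id)

    # Shuffle whole clusters, not individual sequences.
    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)

    # Fill the test set cluster by cluster until it reaches the target size;
    # every remaining cluster goes to the training set.
    target = test_fraction * len(cluster_of)
    train, test = [], []
    for cluster_id in cluster_ids:
        bucket = test if len(test) < target else train
        bucket.extend(clusters[cluster_id])
    return train, test
```

Because whole clusters are moved at once, the realized test fraction only approximates `test_fraction` when cluster sizes are uneven. A random per-sequence split, by contrast, would tend to place near-identical sequences in both sets and inflate the measured accuracy.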