
AcmGENTIC: An End-to-End Solution for Automatically Mining Functional Evidence of Genomic Variants Using Large Language Models

One of the biggest bottlenecks in clinical genomics is converting experimental evidence scattered across a vast body of literature into structured data that can be used to interpret variant pathogenicity. The AcmGENTIC system introduced in this article uses large language models (LLMs) to automate the full process: abstract screening, full-text evidence extraction and classification, and evidence summary generation. It reaches 96% evidence classification accuracy on a ClinGen-based benchmark and provides a scalable technical framework for evidence management in precision medicine.

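Tags: Genomic variants, Functional evidence, Large language models, Precision medicine, Literature mining, ClinGen, ACMG guidelines, Clinical genomics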

Published 2026-03-31 23:08 | Recent activity 2026-04-02 09:48 | Estimated read 6 min

Section 01

AcmGENTIC: An End-to-End Solution for Automatically Mining Functional Evidence of Genomic Variants Using LLMs (Introduction)

Clinical genomics is bottlenecked by the need to convert experimental evidence from a vast body of literature into structured data for variant pathogenicity interpretation, and most variants remain Variants of Uncertain Significance (VUS). The AcmGENTIC system uses large language models to automate the full process (abstract screening, full-text evidence extraction and classification, and evidence summary generation). It reaches 96% evidence classification accuracy on a ClinGen-based benchmark and provides a scalable technical framework for evidence management in precision medicine.

Section 02

Background: Evidence Dilemma in Precision Medicine

In the era of precision medicine, genomic sequencing has become routine, but most variants are classified as VUS. Resolving them requires integrating several kinds of evidence, such as functional experiments and population frequency. Functional evidence is scattered across tens of thousands of publications, and manual curation is time-consuming, labor-intensive, and difficult to scale. Traditional literature mining relies on keyword matching, which struggles with complex biomedical context. To be useful here, an LLM-based approach has to solve two core problems: accurately identifying the relevant literature and extracting structured evidence from it.

Section 03

Research Design: Benchmark Testing Based on ClinGen

The authors built a benchmark dataset from ClinGen expert annotations, extracting PubMed identifiers, evidence labels, and related fields to form "variant-literature" pairs. Two models were evaluated: gpt-4o-mini (a non-reasoning model) and o4-mini (a reasoning model). The task was split into two stages: abstract screening, which judges whether a paper reports functional experiments on a specific variant, and full-text evidence extraction and classification, which extracts the evidence direction, strength, and experiment type.
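The "variant-literature" pairs and the two task stages described above could be represented roughly as follows. This is a minimal sketch, not the authors' schema: every class name, field name, and example value here is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VariantLiteraturePair:
    """One benchmark item: a variant paired with one PubMed article (hypothetical schema)."""
    variant: str                        # variant identifier, e.g. an HGVS string
    pmid: str                           # PubMed identifier of the article
    expert_label: Optional[str] = None  # expert-assigned evidence label, if available


@dataclass(frozen=True)
class ScreeningResult:
    """Stage 1 output: does the abstract report a functional experiment on this variant?"""
    pair: VariantLiteraturePair
    reports_functional_experiment: bool


@dataclass(frozen=True)
class ExtractionResult:
    """Stage 2 output: evidence fields extracted from the full text."""
    pair: VariantLiteraturePair
    direction: str        # illustrative values: "pathogenic" or "benign"
    strength: str         # illustrative values: "supporting", "moderate", "strong"
    experiment_type: str  # free-text description of the assay


def needs_full_text(result: ScreeningResult) -> bool:
    """Only pairs that pass abstract screening move on to full-text extraction."""
    return result.reports_functional_experiment


# Placeholder values, not real benchmark data.
pair = VariantLiteraturePair(variant="GENE1:c.100A>G", pmid="00000000")
screened = ScreeningResult(pair=pair, reports_functional_experiment=True)
```

Keeping the stage 1 and stage 2 outputs as separate types mirrors the two-stage evaluation: each stage can be scored against the expert label on its own.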

Section 04

Evidence: Results of Abstract Screening

In abstract screening, both models achieved high recall (0.88-0.90) but low specificity (0.59-0.65). This "better to include than to miss" trade-off is reasonable for a first pass: the initial screen protects recall, and the subsequent full-text analysis does the fine filtering. The main limitation at this stage is that the models struggle to judge whether an experiment actually tested the variant of interest, so downstream verification is required.
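For readers less familiar with the two metrics, the sketch below shows how recall and specificity are computed from screening outcomes and why a high-recall, low-specificity screen forwards many extra papers. The counts are invented for illustration and are not taken from the study.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of truly relevant papers that the screen keeps."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly irrelevant papers that the screen rejects."""
    return tn / (tn + fp)


# Hypothetical counts for 100 relevant and 100 irrelevant abstracts.
tp, fn = 90, 10   # relevant papers kept / missed
tn, fp = 60, 40   # irrelevant papers rejected / wrongly kept

screen_recall = recall(tp, fn)
screen_specificity = specificity(tn, fp)

# Papers forwarded to full-text analysis: true positives plus false positives.
forwarded = tp + fp
# Relevant papers lost for good, since nothing downstream can recover them.
missed = fn
```

With these made-up numbers the screen forwards 130 papers, 40 of which are irrelevant, but loses only 10 relevant ones. False positives cost extra full-text processing, while false negatives are unrecoverable, which is why a first-pass screen favors recall.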

Section 05

Evidence: Advantages of Full-Text Evidence Extraction

After the "variant matching gate" was introduced, o4-mini performed markedly better than gpt-4o-mini: it reached 96% evidence classification accuracy, a specificity of 0.83 (versus only 0.37 for gpt-4o-mini), and an F1 score of 0.98. An LLM-as-judge evaluation also rated the evidence summaries generated by o4-mini as higher in quality, and this evaluation setup offers a framework for comparing future model iterations.
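The summary does not describe how the "variant matching gate" is implemented. One simple form such a gate could take is a check that the full text actually mentions the target variant before any evidence is classified. The sketch below assumes plain string matching; the function names, the alias list, and the `"no_evidence_for_variant"` label are all hypothetical, and the real system may well use the LLM itself for this step.

```python
import re
from typing import Callable, Iterable


def mentions_variant(full_text: str, aliases: Iterable[str]) -> bool:
    """Return True if any alias appears as a standalone token in the text."""
    for alias in aliases:
        # Lookarounds stop "R175H" from matching inside a longer token such as "R175HX".
        pattern = r"(?<![A-Za-z0-9])" + re.escape(alias) + r"(?![A-Za-z0-9])"
        if re.search(pattern, full_text):
            return True
    return False


def classify_with_gate(
    full_text: str,
    aliases: Iterable[str],
    classify: Callable[[str], str],
) -> str:
    """Run the (hypothetical) classifier only when the gate confirms a variant mention."""
    if not mentions_variant(full_text, aliases):
        return "no_evidence_for_variant"
    return classify(full_text)


# Usage with a stand-in classifier; a real system would call a model here.
aliases = ["R175H", "p.Arg175His"]
label = classify_with_gate(
    "Cells expressing R175H showed reduced activity.",
    aliases,
    classify=lambda text: "pathogenic",
)
```

A gate of this kind can only lower the number of false positives, since it rejects papers before classification, which is consistent with the idea that gating helps specificity.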

Section 06

End-to-End Process of the AcmGENTIC System

The AcmGENTIC process consists of five steps:

1. Variant identifier expansion: converting the HGVS expression into multiple equivalent forms.
2. Intelligent literature retrieval: obtaining metadata and full text from PubMed and other sources.
3. LLM abstract screening: initial screening with lightweight models.
4. Multimodal evidence extraction: analysis of the PDF full text, including chart parsing.
5. Structured report generation: output prepared for expert review.
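As an illustration of the first step, variant identifier expansion, the sketch below turns a protein-level HGVS substitution into the common spellings that papers tend to use. It is an assumption about what "multiple forms" might cover, handles only simple three-letter substitutions, and is not the system's actual expansion logic; `expand_protein_hgvs` is a hypothetical name.

```python
import re
from typing import List

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches simple substitutions such as "p.Arg175His" or "p.(Arg175His)".
_SUBSTITUTION = re.compile(r"^p\.\(?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})\)?$")


def expand_protein_hgvs(hgvs: str) -> List[str]:
    """Return alternative spellings of a protein substitution, sorted.

    Inputs this sketch does not understand are returned unchanged.
    """
    text = hgvs.strip()
    match = _SUBSTITUTION.match(text)
    if match is None:
        return [text]
    ref3, pos, alt3 = match.groups()
    if ref3 not in THREE_TO_ONE or alt3 not in THREE_TO_ONE:
        return [text]
    ref1, alt1 = THREE_TO_ONE[ref3], THREE_TO_ONE[alt3]
    forms = {
        text,                      # original spelling
        f"p.{ref3}{pos}{alt3}",    # three-letter with prefix
        f"{ref3}{pos}{alt3}",      # three-letter without prefix
        f"p.{ref1}{pos}{alt1}",    # one-letter with prefix
        f"{ref1}{pos}{alt1}",      # one-letter without prefix
    }
    return sorted(forms)


aliases = expand_protein_hgvs("p.Arg175His")
```

The resulting alias list can then feed both literature retrieval and any later check that a paper really discusses the variant. Coding-level (c.) expressions, frameshifts, and legacy names would need additional rules or a dedicated HGVS library.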

Section 07

Technical Insights and Clinical Significance

Technical insights: The system adopts a "human-in-the-loop" approach in which the LLM handles the tedious work and experts review the decisions, so each side contributes what it does best. Clinical significance: The system addresses the variant annotation burden created by growing demand for genomic sequencing, and the human-machine collaboration model balances automation against accuracy, offering a practical path for precision medicine.

Section 08

Limitations and Future Directions

Limitations: The data derived from ClinGen may carry domain bias; only English-language literature is processed; and parsing of complex charts needs improvement. Future directions: Expand the data to cover more diseases and variants; optimize fine-tuning strategies; strengthen chart understanding; and establish an expert feedback mechanism for continuous iteration.
