# AcmGENTIC: An End-to-End Solution for Automatically Mining Functional Evidence of Genomic Variants Using Large Language Models

> One of the biggest bottlenecks in clinical genomics is converting experimental evidence from a vast body of literature into structured data that can be used to interpret variant pathogenicity. The AcmGENTIC system introduced in this article uses large language models (LLMs) to automate the full process: abstract screening, full-text evidence extraction and classification, and evidence summary generation. It achieves 96% accuracy on the ClinGen benchmark and provides a scalable technical framework for evidence management in precision medicine.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-03-31T15:08:37.000Z
- Last activity: 2026-04-02T01:48:00.446Z
- Popularity: 125.3
- Keywords: genomic variants, functional evidence, large language models, precision medicine, literature mining, ClinGen, ACMG guidelines, clinical genomics
- Page link: https://www.zingnex.cn/en/forum/thread/acmgentic
- Canonical: https://www.zingnex.cn/forum/thread/acmgentic
- Markdown source: floors_fallback

---

## AcmGENTIC: An End-to-End Solution for Automatically Mining Functional Evidence of Genomic Variants Using LLM (Introduction)

Clinical genomics is bottlenecked by the need to convert experimental evidence from a vast body of literature into structured data for variant pathogenicity interpretation, and most variants remain Variants of Uncertain Significance (VUS). AcmGENTIC uses large language models (LLMs) to automate the full process, from abstract screening through full-text evidence extraction and classification to evidence summary generation. It achieves 96% accuracy on the ClinGen benchmark and provides a scalable technical framework for evidence management in precision medicine.

## Background: Evidence Dilemma in Precision Medicine

In the era of precision medicine, genomic sequencing has become routine, but most variants are classified as VUS, and resolving them requires integrating several kinds of evidence, such as functional experiments and population frequency. Functional evidence is scattered across tens of thousands of publications, and manual curation is time-consuming, labor-intensive, and hard to scale. Traditional literature mining relies on keyword matching, which struggles with complex biomedical context. To be useful here, an LLM has to solve two core problems: accurately identifying the relevant literature and extracting structured evidence from it.

## Research Design: Benchmark Testing Based on ClinGen

A benchmark dataset was built from ClinGen expert annotations by extracting PubMed identifiers, evidence labels, and related fields to form "variant-literature" pairs. Two models were evaluated: gpt-4o-mini (a non-reasoning model) and o4-mini (a reasoning model). The task was split into two stages: abstract screening, which judges whether a paper reports functional experiments on the specific variant, and full-text evidence extraction and classification, which extracts evidence direction, strength, and experiment type.
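The two-stage design can be pictured as a small driver that only sends a variant-literature pair to full-text extraction once it has passed abstract screening. The sketch below is illustrative only: the class names, field names, and the `screen` and `extract` callables are hypothetical stand-ins for the LLM calls, not the authors' actual code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class VariantPaperPair:
    """One benchmark item: a variant and a paper that may describe it (hypothetical schema)."""
    variant: str
    pmid: str


@dataclass(frozen=True)
class Evidence:
    """Fields named in the text: evidence direction, strength, and experiment type."""
    direction: str
    strength: str
    experiment_type: str


def run_two_stage(
    pairs: Iterable[VariantPaperPair],
    screen: Callable[[VariantPaperPair], bool],
    extract: Callable[[VariantPaperPair], Evidence],
) -> dict:
    """Stage 1 screens each pair; stage 2 extracts evidence only for pairs that pass."""
    results = {}
    for pair in pairs:
        if not screen(pair):
            # Screened out: no full-text work is done for this pair.
            results[pair] = None
            continue
        results[pair] = extract(pair)
    return results
```

Keeping the two stages as separate callables makes it possible to swap a lightweight model into screening and a stronger one into extraction without touching the driver.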

## Evidence: Results of Abstract Screening

In abstract screening, both models achieved high recall (0.88-0.90) but low specificity (0.59-0.65). This "better to include than to miss" trade-off is reasonable for a first pass: screening protects recall, and the subsequent full-text analysis does the fine filtering. The main limitation at this stage is that the models struggle to judge whether an experiment actually targets the variant of interest, so later verification is required.
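For readers less familiar with these metrics, recall is TP / (TP + FN) and specificity is TN / (TN + FP). The short sketch below computes both from confusion-matrix counts. The counts in the usage note are invented purely for illustration and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # relevant papers correctly kept
    fp: int  # irrelevant papers wrongly kept
    tn: int  # irrelevant papers correctly rejected
    fn: int  # relevant papers wrongly rejected


def recall(c: ConfusionCounts) -> float:
    """Share of truly relevant papers that the screen keeps."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of truly irrelevant papers that the screen rejects."""
    return c.tn / (c.tn + c.fp)


# Illustrative counts only (not data from the study).
example = ConfusionCounts(tp=90, fp=40, tn=60, fn=10)
print(f"recall={recall(example):.2f}, specificity={specificity(example):.2f}")
```

With these made-up counts the screen keeps 90 of 100 relevant papers but also lets through 40 of 100 irrelevant ones, which is the kind of high-recall, low-specificity profile described above.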

## Evidence: Advantages of Full-Text Evidence Extraction

After a "variant matching gate" was introduced, o4-mini clearly outperformed gpt-4o-mini: evidence classification accuracy reached 96%, with a specificity of 0.83 (versus only 0.37 for gpt-4o-mini) and an F1 score of 0.98. An LLM-as-judge evaluation also rated the summaries generated by o4-mini as higher quality, which provides an evaluation framework for future model iteration.
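The article does not say how the "variant matching gate" is implemented, and it may well be LLM-based. One simple way to picture the idea is a check that the full text actually mentions the target variant before any evidence classification is attempted. The sketch below uses plain string matching as a stand-in; `mentions_variant`, `gated_classify`, and the `classify` callable are hypothetical names, not part of AcmGENTIC.

```python
import re
from typing import Callable, Optional, Sequence


def mentions_variant(text: str, aliases: Sequence[str]) -> bool:
    """Return True if any alias occurs in the text as a standalone token."""
    for alias in aliases:
        # Lookarounds stop "R175H" from matching inside a longer alphanumeric token.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(alias) + r"(?![A-Za-z0-9])"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False


def gated_classify(
    text: str,
    aliases: Sequence[str],
    classify: Callable[[str], str],
) -> Optional[str]:
    """Run the (hypothetical) classifier only when the gate passes."""
    if not mentions_variant(text, aliases):
        return None  # gate closed: the paper is not treated as evidence for this variant
    return classify(text)
```

A gate of this kind would mainly cut false positives, which is consistent with the specificity gain reported above, although the real system's mechanism may differ.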

## End-to-End Process of the AcmGENTIC System

The AcmGENTIC process includes:

1. Variant identifier expansion (converting HGVS to multiple forms)
2. Intelligent literature retrieval (obtaining metadata and full text from PubMed, etc.)
3. LLM abstract screening (initial screening with lightweight models)
4. Multimodal evidence extraction (PDF full-text analysis including chart parsing)
5. Structured report generation (for expert review)
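Variant identifier expansion matters because papers often write the same variant in different notations, for example `p.Arg175His` versus `R175H`. The sketch below covers only simple protein-level missense substitutions and is an assumption about what such expansion could look like; the function name and the set of output forms are hypothetical, and the article does not specify which forms AcmGENTIC generates.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches simple missense notation such as "p.Arg175His" (assumed input shape).
_MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def expand_protein_variant(hgvs_p: str) -> list:
    """Return common spellings of a simple missense variant, or the input unchanged."""
    cleaned = hgvs_p.strip()
    match = _MISSENSE.match(cleaned)
    if match is None:
        return [cleaned]  # anything else is out of scope for this sketch
    ref, pos, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        return [cleaned]
    short = f"{THREE_TO_ONE[ref]}{pos}{THREE_TO_ONE[alt]}"
    return [f"p.{ref}{pos}{alt}", f"{ref}{pos}{alt}", f"p.{short}", short]


print(expand_protein_variant("p.Arg175His"))
```

The resulting aliases can then feed both literature retrieval and any later check that a paper really discusses the target variant.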

## Technical Insights and Clinical Significance

Technical insights: The system adopts a "human-in-the-loop" approach in which the LLM handles the tedious work and experts review the decisions, so each side plays to its strengths. Clinical significance: The system eases the variant annotation burden created by growing demand for genomic sequencing, and the human-machine collaboration model balances automation against accuracy, offering a practical path forward for precision medicine.

## Limitations and Future Directions

Limitations: The ClinGen-derived data may carry domain bias; only English-language literature is processed; and parsing of complex charts still needs improvement. Future directions: Expand the data to cover more diseases and variants; optimize fine-tuning strategies; improve chart understanding; and establish an expert feedback mechanism for continuous iteration.
