Zing Forum

MAPLE: Automating QSP Model Parameter Extraction from Literature Using Large Language Models

A structured pipeline tool that uses LLMs to extract quantitative pharmacology parameters from scientific literature, generates informative prior distributions via Bayesian inference, and addresses data challenges in QSP model calibration.

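Tags: Quantitative Systems Pharmacology, QSP Models, Literature Mining, Bayesian Inference, NumPyro, Parameter Calibration, LLM Applications, Drug Development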
Published 2026-04-09 06:14 | Recent activity 2026-04-09 06:19 | Estimated read 6 min

Section 01

Introduction: MAPLE—An LLM-Driven Tool for Automated QSP Model Parameter Extraction

MAPLE is a structured pipeline tool that uses Large Language Models (LLMs) to extract quantitative pharmacology parameters from scientific literature. It generates informative prior distributions via Bayesian inference, addressing challenges like scattered data and heterogeneous sources in Quantitative Systems Pharmacology (QSP) model calibration, and supports parameter extraction and model building in drug development.

Section 02

Project Background and Core Issues

QSP models contain numerous biological parameters that cannot be measured directly in the clinic. The relevant data are scattered across hundreds of publications, span different species and indications, and arrive in diverse formats that are hard to convert to a common scale. Traditional manual processing is error-prone; MAPLE automates and standardizes the work through LLM-assisted extraction and statistical inference.

Section 03

Core Methods and Architecture Design

MAPLE uses a two-stage calibration pipeline:

  1. Literature extraction and validation: an LLM produces YAML files that are validated with Pydantic, and joint MCMC inference then turns them into sub-model priors.
  2. Simulation-based inference (SBI) that combines clinical data with QSP simulators.

The innovation lies in quantifying data source quality, scored along 8 dimensions, and in adjusting each source's weight through a translation sigma. For example, mouse data receives a lower weight than human clinical data.
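
One way to read the translation-sigma idea is as variance inflation: each literature measurement carries its own measurement uncertainty plus an extra term that grows with the distance between the source and the target context. The sketch below illustrates that reading only. It is not MAPLE's implementation; the function name, the sigma values, and the inverse-variance pooling rule are all assumptions chosen for the example.

```python
import math

def pooled_estimate(measurements):
    """Inverse-variance pooling of log-scale measurements.

    Each item is (log_value, measurement_sigma, translation_sigma).
    The total variance is the sum of the two variances, so a source with a
    larger translation sigma contributes less to the pooled mean.
    """
    weights = []
    for _, meas_sigma, trans_sigma in measurements:
        total_var = meas_sigma**2 + trans_sigma**2
        weights.append(1.0 / total_var)
    weight_sum = sum(weights)
    mean = sum(w * m[0] for w, m in zip(weights, measurements)) / weight_sum
    sigma = math.sqrt(1.0 / weight_sum)
    normalized = [w / weight_sum for w in weights]
    return mean, sigma, normalized

# Hypothetical values: same measurement noise, different translation sigma.
human = (math.log(0.10), 0.3, 0.1)  # human clinical source
mouse = (math.log(0.30), 0.3, 0.8)  # mouse source, penalized more heavily
mean, sigma, weights = pooled_estimate([human, mouse])
print(f"pooled log-mean={mean:.3f}, sigma={sigma:.3f}, weights={weights}")
```

With these made-up numbers the human source dominates the pooled estimate, which is the qualitative behavior the article describes for mouse versus human data.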

Section 04

Technical Implementation Details

  • YAML Structure: Each file links literature measurements to model parameters in a structured way, including the target ID, inputs, calibration rules, and source relevance.
  • Forward Model Types: Supports several types, including algebraic formulas, dose-response fitting, and ODE systems.
  • Nuisance Parameter Handling: Additional parameters can be marked as nuisance; they are estimated during MCMC but excluded from the final output.
  • Batch Pipeline: Runs in stages (literature search, PDF collection, evaluation, extraction, validation) and supports caching.
  • Input Format: The target parameter CSV must include ID, parameter, cancer type, and search annotations (e.g., specific keywords).
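
To make the record structure and the nuisance-parameter rule concrete, here is a minimal sketch. The article says MAPLE validates its YAML with Pydantic; this example uses standard-library dataclasses as a dependency-free stand-in, and every field name, the [0, 1] relevance scale, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean, stdev

@dataclass
class ParameterSpec:
    name: str
    nuisance: bool = False  # estimated during inference, dropped from the report

@dataclass
class ExtractionRecord:
    target_id: str
    source_relevance: float  # hypothetical score in [0, 1]
    parameters: list = field(default_factory=list)

    def __post_init__(self):
        # Reject malformed records early, before any inference is run.
        if not self.target_id:
            raise ValueError("target_id must be non-empty")
        if not 0.0 <= self.source_relevance <= 1.0:
            raise ValueError("source_relevance must lie in [0, 1]")
        if not any(not p.nuisance for p in self.parameters):
            raise ValueError("record needs at least one non-nuisance parameter")

def summarize(record, samples):
    """Return (mean, standard deviation) for non-nuisance parameters only."""
    report = {}
    for spec in record.parameters:
        if spec.nuisance:
            continue
        draws = samples[spec.name]
        report[spec.name] = (mean(draws), stdev(draws))
    return report

record = ExtractionRecord(
    target_id="T001",
    source_relevance=0.8,
    parameters=[ParameterSpec("k_growth"), ParameterSpec("assay_offset", nuisance=True)],
)
# Stand-in for MCMC draws; real draws would come from the sampler.
samples = {"k_growth": [0.09, 0.11, 0.10, 0.12], "assay_offset": [1.9, 2.1, 2.0, 2.2]}
report = summarize(record, samples)
print(report)
```

The nuisance parameter still needs draws in `samples`, since it takes part in the fit, but it never appears in `report`.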

Section 05

Usage Methods

  • Coding Assistant Collaboration: Work with coding assistants such as Claude or Codex over MCP to run literature search, extraction, and YAML validation automatically; users are responsible for reviewing the results.
  • Python API: Call the process_targets function to handle priors_csv and yaml files.
  • Batch Extraction: Process large-scale parameters in parallel stages; results of each stage are independently cached.
  • Best Practices: Annotation fields should include rate formulas and specific search terms (e.g., "MVD growth kinetics") to improve search efficiency.
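
Per-stage caching can be sketched as follows. This is an illustration under assumptions, not MAPLE's code: the `run_stage` helper, the JSON cache layout, and the two toy stage functions are hypothetical, and real stages would call search services or an LLM.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def run_stage(name, func, payload, cache_dir):
    """Run one stage, reusing a cached result when the input is unchanged."""
    key_source = json.dumps({"stage": name, "input": payload}, sort_keys=True)
    key = hashlib.sha256(key_source.encode("utf-8")).hexdigest()[:16]
    cache_file = Path(cache_dir) / f"{name}-{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8")), True
    result = func(payload)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result, False

# Hypothetical stage functions standing in for search and extraction.
def search(targets):
    return [{"target": t, "query": f"{t} kinetics"} for t in targets]

def extract(hits):
    return [{"target": h["target"], "status": "extracted"} for h in hits]

cache_hits = []
with tempfile.TemporaryDirectory() as cache_dir:
    for attempt in range(2):
        hits, search_cached = run_stage("search", search, ["MVD growth"], cache_dir)
        rows, extract_cached = run_stage("extract", extract, hits, cache_dir)
        cache_hits.append((search_cached, extract_cached))
print(cache_hits)
```

Keying each cache file on the stage name plus a hash of its input means a rerun skips any stage whose input has not changed, while a change upstream invalidates only the stages that depend on it.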

Section 06

Application Scenarios and Value

Applicable to:

  1. QSP model building for new drug development
  2. Model recalibration (updating parameters with new data)
  3. Cross-species/indication model transfer
  4. Regulatory submission support (parameter traceability and uncertainty quantification)

MAPLE significantly lowers the barrier to QSP model calibration, enabling more teams to build high-quality models.

Section 07

Summary and Outlook

MAPLE uses LLMs as information extraction tools, leaving validation and interpretation to scientists, and integrates literature extraction, statistical inference, and uncertainty quantification into a standardized pipeline. As LLM capabilities improve, such tools are likely to play a larger role in biomedical research.