# Innovative Application Research of Open-Source Large Language Models in Software Metadata Entity Disambiguation

> This article introduces a study that uses open-source large language models to solve the problem of software metadata entity disambiguation. By constructing a multi-annotator benchmark dataset, the study compares three reasoning strategies—direct prompting, self-consistency, and multi-step agent-based reasoning—and explores feasible paths to achieve high-precision entity resolution in noisy and heterogeneous data environments.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-14T18:09:21.000Z
- Last activity: 2026-05-14T18:17:25.221Z
- Popularity: 141.9
- Keywords: large language models, entity disambiguation, metadata governance, open-source models, software knowledge graph, reasoning strategies, entity resolution, data quality
- Page link: https://www.zingnex.cn/en/forum/thread/llm-github-evamart-semantic-disambiguation-llms
- Canonical: https://www.zingnex.cn/forum/thread/llm-github-evamart-semantic-disambiguation-llms
- Markdown source: floors_fallback

---

## Innovative Application Research of Open-Source Large Language Models in Software Metadata Entity Disambiguation (Introduction)

This article introduces a study that uses open-source large language models to solve the problem of software metadata entity disambiguation. Its key contributions are: a multi-annotator benchmark dataset; a comparison of three reasoning strategies (direct prompting, self-consistency, and multi-step agent-based reasoning); feasible paths to high-precision entity resolution in noisy, heterogeneous data environments; and a reproducible, controllable approach to data governance for academic institutions and enterprises.

## Research Background and Challenges

In the scientific research software ecosystem, metadata suffers from quality issues such as ambiguous naming, version confusion, and inconsistent descriptions, which makes entity disambiguation extremely challenging. Traditional rule-matching methods are ineffective on heterogeneous, noisy data, while commercial API solutions are costly and raise privacy concerns. The EvaMart team therefore explores locally deployed open-source large language models as a path to reliable entity disambiguation.

## Task Definition and Dataset Construction

**Task Definition**: Entity disambiguation is formalized as a three-way classification problem (same software / different software / insufficient evidence). The input is multi-source evidence (name, description, webpage, code repository, etc.), and the output is a structured verdict with a confidence score and a reasoning basis.
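The structured output described above can be sketched as a small schema with validation. This is a minimal illustration; the class and field names (`Verdict`, `DisambiguationResult`, `parse_result`) are assumptions, not the project's actual interface.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SAME = "same_software"
    DIFFERENT = "different_software"
    INSUFFICIENT = "insufficient_evidence"


@dataclass
class DisambiguationResult:
    verdict: Verdict
    confidence: float  # model-reported probability in [0, 1]
    rationale: str     # evidence-grounded reasoning chain


def parse_result(raw: dict) -> DisambiguationResult:
    """Validate a model's JSON output against the expected schema."""
    verdict = Verdict(raw["verdict"])  # raises ValueError on unknown labels
    confidence = float(raw["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return DisambiguationResult(verdict, confidence, raw.get("rationale", ""))
```

Forcing the model to emit one of three fixed labels plus a confidence makes downstream aggregation (voting, triage) mechanical rather than ad hoc.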

**Dataset**: The benchmark comprises approximately 1,000 real cases drawn from OpenEBench. A multi-annotator protocol with Cohen's Kappa agreement scores ensures label quality, and a balanced subset is provided to handle class imbalance.
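Cohen's Kappa, used above to quantify inter-annotator agreement, corrects raw agreement for chance. A minimal from-scratch sketch for two annotators (the study's exact annotation tooling is not specified here):

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is the agreement expected by chance from label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Values near 1 indicate strong agreement; values near 0 mean the annotators agree no more often than chance, a signal that guideline revision is needed before the labels are trusted.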

## Comparison of Three Reasoning Strategies

The study compares three strategies:
1. **Direct Prompting**: all evidence is supplied in a single prompt; lowest cost, but performance is unstable on complex cases.
2. **Self-Consistency**: several stochastic inferences are sampled and the majority verdict wins; more reliable, at increased computational cost.
3. **Agent-Based Multi-Step**: mimics a human reasoning workflow (evidence extraction → diagnosis → targeted retrieval → decision → verification); strongest on complex cases but requires the most model calls (5-6 per case), and when evidence is insufficient it triggers targeted retrieval rather than guessing.
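The self-consistency strategy above can be sketched as majority voting over repeated sampled calls. The `classify_once` callable stands in for one temperature-sampled LLM inference; the function name and the agreement-as-confidence proxy are illustrative assumptions, not the study's exact implementation.

```python
from collections import Counter


def self_consistency(classify_once, case, n_samples=5):
    """Run n stochastic inferences on the same case and return the
    majority verdict plus the fraction of votes it received, which
    serves as a cheap empirical confidence proxy."""
    votes = [classify_once(case) for _ in range(n_samples)]
    verdict, count = Counter(votes).most_common(1)[0]
    return verdict, count / n_samples
```

With a deterministic classifier the agreement is trivially 1.0; the strategy pays off when sampling temperature makes individual calls disagree on borderline cases, since the vote margin then exposes that uncertainty.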

## Experimental Design and Engineering Practice

**Experiment**: Open-source models are deployed locally on HPC with no commercial API dependencies. All parameters are recorded in configuration files, and each run generates a manifest.json capturing environment information to ensure reproducibility.

**Engineering**: The data directory is immutable; model outputs are stored in an independent runs directory; prompt versions are managed via file names; the code is environment-independent and can be tested locally or deployed on HPC.

## Result Analysis and Future Outlook

**Results**: Direct prompting is the cheapest but accuracy-limited; self-consistency trades moderate extra cost for a performance gain; the agent-based strategy shows clear advantages on complex cases. An uncertainty-aware scheme (automatic decisions at high confidence, manual review otherwise) is proposed to balance quality and efficiency.
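The uncertainty-aware triage described above reduces to a routing rule over the model's structured output. The 0.85 threshold and field names here are assumed example values, not figures from the study:

```python
def route(result, threshold=0.85):
    """Uncertainty-aware triage: auto-accept confident verdicts,
    send everything else (including explicit 'insufficient evidence'
    verdicts, regardless of confidence) to human review."""
    if result["verdict"] == "insufficient_evidence":
        return "manual_review"
    if result["confidence"] >= threshold:
        return "auto_decision"
    return "manual_review"
```

In practice the threshold would be calibrated on a held-out set so that the auto-decision bucket meets a target precision, with the review queue absorbing the remainder.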

**Significance**: The study shows that open-source models can match commercial-API-level performance on a specific task while preserving data sovereignty, and it provides a technical framework and benchmark dataset for building research-software knowledge graphs.

**Outlook**: Explore more efficient reasoning strategies, expand task types, and refine uncertainty quantification methods.
