# AgriChain: An Expert-Verified Reasoning Dataset for Interpretable Agricultural Vision-Language Models

> This article introduces the AgriChain dataset, which contains approximately 11,000 expert-curated plant leaf images, each paired with a disease label, a confidence score, and expert-verified chain-of-thought reasoning. The AgriChain-VL3B model fine-tuned on this dataset outperforms general-purpose baselines from the Gemini and GPT-4o families in plant disease diagnosis.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-09T05:13:37.000Z
- Last activity: 2026-04-10T02:23:56.690Z
- Popularity: 120.8
- Keywords: AgriChain, agricultural vision-language models, plant disease diagnosis, chain-of-thought reasoning, interpretable AI, expert verification, sustainable agriculture, domain specialization
- Page link: https://www.zingnex.cn/en/forum/thread/agrichain
- Canonical: https://www.zingnex.cn/forum/thread/agrichain
- Markdown source: floors_fallback

---

## AgriChain Dataset: An Expert-Verified Reasoning Resource for Interpretable Agricultural Vision-Language Models

AgriChain is a dataset of approximately 11,000 expert-curated plant leaf images. Each image comes with a disease label, a confidence score, and expert-verified chain-of-thought reasoning. The AgriChain-VL3B model fine-tuned on this dataset outperforms general-purpose baselines from the Gemini and GPT-4o families in plant disease diagnosis, which makes the dataset a useful resource for interpretable agricultural AI.

## Core Challenges Facing Agricultural AI: Accuracy and Interpretability

Globally, 20%-40% of crop yields are lost to pests and diseases each year, and professional pathologists are scarce. General-purpose Vision-Language Models (VLMs) have two major problems in agricultural applications:

1. They lack specialized training for agricultural scenarios, which makes it difficult for them to identify subtle disease features.
2. Their black-box predictions are not interpretable, which makes it hard to earn farmers' trust.

## AgriChain Dataset Construction and Model Fine-Tuning Methods

The AgriChain dataset contains approximately 11,000 leaf images. Each image is annotated with a disease label, a calibrated confidence score, and expert-verified chain-of-thought reasoning. The annotations are produced through human-machine collaboration: GPT-4o generates drafts, which agricultural engineers then review and revise to keep them professionally consistent. The AgriChain-VL3B model is fine-tuned from Qwen2.5-VL-3B and uses multi-task learning to perform disease classification and reasoning generation at the same time, which improves both accuracy and interpretability.
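To make the annotation format and the joint training objective more concrete, here is a minimal Python sketch. The field names (`image_path`, `label`, `confidence`, `reasoning`), the prompt wording, and the target layout are all assumptions for illustration; the source does not give the actual schema or training template. Putting the reasoning and the label in one target sequence is only one common way to train a generative VLM on both tasks at once, and the authors may have done it differently.

```python
from dataclasses import dataclass


@dataclass
class AgriChainRecord:
    """One annotated leaf image (hypothetical schema, not the official one)."""
    image_path: str
    label: str          # disease label
    confidence: float   # calibrated confidence, assumed to lie in [0, 1]
    reasoning: str      # expert-verified chain-of-thought text

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")


# Assumed instruction text; the real prompt is not given in the source.
PROMPT = "Diagnose the disease on this leaf and explain your reasoning."


def to_training_example(record: AgriChainRecord) -> dict:
    """Build one supervised example whose target holds both the reasoning
    and the final label, so a single fine-tuning pass covers classification
    and explanation together."""
    target = (
        f"Reasoning: {record.reasoning}\n"
        f"Diagnosis: {record.label}\n"
        f"Confidence: {record.confidence:.2f}"
    )
    return {"image": record.image_path, "prompt": PROMPT, "target": target}


# Placeholder values for illustration only; not taken from the dataset.
example = to_training_example(
    AgriChainRecord(
        image_path="leaf_0001.jpg",
        label="early blight",
        confidence=0.9,
        reasoning="Concentric brown rings with a yellow halo on older leaves.",
    )
)
print(example["target"])
```

With this layout the model learns to emit its reasoning before the diagnosis, so the explanation and the label come from the same forward pass instead of from a separate post-hoc explainer.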

## Experimental Results: AgriChain-VL3B Outperforms General-Purpose Large Models and Has High Interpretability

On the test set, AgriChain-VL3B achieved a Top-1 accuracy of 73.1%, a macro-averaged F1 score of 0.466, and a weighted F1 score of 0.655, outperforming general-purpose models such as Gemini 1.5 Flash, Gemini 2.5 Pro, and GPT-4o Mini. The explanations it generates align closely with expert reasoning and consistently cite key visual cues, which gives them both credibility and educational value.
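The three reported metrics weight classes differently, which may explain why the macro F1 (0.466) sits well below the weighted F1 (0.655): macro F1 counts every disease class equally, so weak results on rare classes pull it down. The Python sketch below computes all three on toy, made-up labels to show the effect. It is not the authors' evaluation code, and it averages only over classes present in the ground truth, which is an assumption.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Return {class: F1} for every class that appears in y_true."""
    scores = {}
    for cls in set(y_true):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[cls] = 2 * tp / denom if denom else 0.0
    return scores


def summarize(y_true, y_pred):
    """Top-1 accuracy, macro F1 (classes weighted equally), and
    weighted F1 (classes weighted by their support in y_true)."""
    n = len(y_true)
    f1 = f1_per_class(y_true, y_pred)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    macro = sum(f1.values()) / len(f1)
    weighted = sum(f1[c] * support[c] for c in f1) / n
    return accuracy, macro, weighted


# Toy labels: one frequent class and one rare class (not real results).
y_true = ["healthy"] * 8 + ["rust"] * 2
y_pred = ["healthy"] * 8 + ["healthy", "rust"]
acc, macro, weighted = summarize(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={macro:.3f} weighted_f1={weighted:.3f}")
# prints: accuracy=0.900 macro_f1=0.804 weighted_f1=0.886
```

In the toy data a single missed sample of the rare class costs the macro score far more than the weighted one, which is the same pattern as in the reported numbers.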

## Technical Contributions and Significance for Sustainable Agriculture

Technical contributions:

1. One of the first large-scale agricultural vision-language datasets with expert chain-of-thought annotations.
2. A human-machine collaborative annotation process that makes domain knowledge acquisition scalable.
3. Evidence that specialized fine-tuning can outperform general-purpose large models on this task.

Significance for sustainable agriculture: the approach can help reduce pesticide overuse, spread agricultural knowledge, lower the barrier to adopting AI, and make the technology more widely accessible.

## Current Limitations and Future Research Directions

Limitations: The dataset focuses mainly on leaves and covers few other crop parts. It also reflects specific regional climates, so cross-regional generalization remains to be verified.

Future directions: Expand the dataset to more crop parts and crop types; integrate multi-modal data (images, sensor readings, and meteorological data); and develop mobile applications that farmers can use in the field.
