System Architecture Design
AgriSense uses a modular three-layer architecture. Each layer has a distinct responsibility, and the layers communicate through well-defined interfaces:
Visual Recognition Layer: ResNet50-based CNN Model
The system uses a ResNet50 model fine-tuned on the PlantVillage dataset for plant disease classification. PlantVillage is a public dataset of tens of thousands of labeled plant leaf images covering common diseases across a range of crops. Fine-tuned on this labeled data, the model learns to recognize lesion patterns on leaves and outputs a predicted disease category.
ResNet50 is a classic deep residual network and a reliable baseline for image classification. Its residual (skip) connections mitigate the vanishing-gradient problem in deep networks, which helps the model stay accurate on fine-grained tasks such as agricultural image classification.
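The prediction step described above can be illustrated with a minimal sketch of how a classifier's raw output scores (logits) become a ranked list of disease categories. This is not AgriSense's actual code: the label names, the example logits, and the function names are all hypothetical, and the logits stand in for whatever the fine-tuned ResNet50 would produce for one image.

```python
import math

# Hypothetical label subset; the real model's class list is not given in the text.
CLASS_NAMES = ["Tomato___healthy", "Tomato___Early_blight", "Tomato___Late_blight"]


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_predictions(logits, class_names, k=2):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits standing in for the network's output on a single leaf image.
predictions = top_predictions([0.2, 3.1, 1.4], CLASS_NAMES)
```

Returning the top few candidates with their probabilities, rather than a single label, would let the later layers treat a low-confidence prediction more cautiously.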
Knowledge Retrieval Layer: TF-IDF-Driven RAG Architecture
Visual recognition alone can only tell users "what disease this is", whereas farmers care more about "how to treat it". To bridge this gap, the system adopts a Retrieval-Augmented Generation (RAG) architecture, which combines the generative ability of a large language model with a domain knowledge base.
The knowledge base stores expert content on agricultural disease management as Markdown and plain-text files, including:
- Detailed disease descriptions (symptoms, causal factors, susceptible crops)
- Treatment plans (chemical and organic control methods)
- Pesticide use guidelines (dosage, application timing, precautions)
- Crop cultivation management recommendations
The system splits the knowledge base documents into chunks, then vectorizes and indexes them with TF-IDF. When a user submits a query, the system computes the similarity between the query and each chunk and retrieves the top-k most relevant ones. These chunks are injected as context into the subsequent large language model generation step, so that the treatment recommendations are grounded in the knowledge base and the risk of model hallucination is reduced.
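The retrieval step can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the project's implementation: the text does not say which similarity measure or IDF variant is used, so cosine similarity and a smoothed IDF are assumed here, and the class name `TfidfIndex`, the tokenizer, and the sample chunks are hypothetical placeholders.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TfidfIndex:
    """Tiny in-memory TF-IDF index over text chunks (illustrative only)."""

    def __init__(self, chunks):
        self.chunks = chunks
        tokenized = [tokenize(c) for c in chunks]
        n = len(chunks)
        # Document frequency: in how many chunks each term appears.
        df = Counter(term for tokens in tokenized for term in set(tokens))
        # Smoothed IDF (an assumption) so terms found in every chunk keep a positive weight.
        self.idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
        self.vectors = [self._vectorize(tokens) for tokens in tokenized]

    def _vectorize(self, tokens):
        """Build an L2-normalized sparse TF-IDF vector; unknown terms are dropped."""
        counts = Counter(t for t in tokens if t in self.idf)
        vec = {t: c * self.idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query, k=2):
        """Return up to k (chunk, cosine similarity) pairs with a nonzero score, best first."""
        q = self._vectorize(tokenize(query))
        scored = []
        for i, vec in enumerate(self.vectors):
            score = sum(w * vec.get(t, 0.0) for t, w in q.items())
            scored.append((score, i))
        scored.sort(key=lambda s: (-s[0], s[1]))
        return [(self.chunks[i], score) for score, i in scored[:k] if score > 0]


# Placeholder chunks written for this example, not taken from the real knowledge base.
CHUNKS = [
    "Early blight on tomato causes brown spots with concentric rings on older leaves.",
    "Apply copper fungicide at the labeled rate when early blight symptoms first appear.",
    "Rotate tomato crops and water at the base of the plant to reduce leaf wetness.",
]

index = TfidfIndex(CHUNKS)
results = index.search("brown spots with rings on tomato leaves", k=2)
```

The chunks returned by `search` are what would be passed to the language model as context. Note that plain TF-IDF matches exact tokens only, so "leaf" and "leaves" count as different terms unless stemming is added.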
Intelligent Decision Layer: Plan-Draft-Reflect Three-Stage Workflow
This is the most distinctive part of AgriSense's design. Rather than relying on a single retrieve-then-generate pass, the system adopts an agentic workflow that refines the output over three stages: Plan → Draft → Reflect:
Plan Stage: The agent first analyzes the user's query and decomposes it into a structured diagnosis strategy. For example, for a question like "Why are my tomato leaves turning yellow and curling?", the system will plan a diagnosis path: "Identify symptom features → Match possible causes → Recommend verification methods → Provide preliminary suggestions".
Draft Stage: The agent generates a preliminary consultation response from the strategy produced in the Plan stage and the knowledge fragments retrieved by RAG. The response cites the relevant knowledge base paragraphs to make its claims traceable.
Reflect Stage: The agent reviews its own Draft output, checking for factual errors, logical gaps, and missing key information. If it finds problems, it triggers a revision step that regenerates a more accurate response. This self-reflection adds about 1-2 seconds of latency, but it can significantly improve the factual accuracy of the answers.
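The control flow of the three stages can be sketched as below. This is a hedged illustration rather than the actual AgriSense agent: the `llm` callable interface, the prompt wording, the `max_revisions` cap, and the "reply OK" convention are all assumptions, and `stub_llm` is a canned stand-in that makes the loop runnable without a real model.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function that takes a prompt and returns text.
LLM = Callable[[str], str]


def plan_draft_reflect(query: str, fragments: List[str], llm: LLM,
                       max_revisions: int = 2) -> Dict[str, object]:
    """Run Plan -> Draft -> Reflect, revising the draft while the review finds problems."""
    context = "\n".join(f"[{i + 1}] {frag}" for i, frag in enumerate(fragments))

    # Plan: turn the question into a structured diagnosis strategy.
    plan = llm(f"PLAN\nQuestion: {query}\nList the diagnostic steps.")

    # Draft: answer from the plan plus the retrieved fragments.
    draft = llm(f"DRAFT\nPlan: {plan}\nContext:\n{context}\n"
                f"Question: {query}\nCite fragments by number.")

    # Reflect: review the draft and revise it, up to max_revisions times.
    revisions = 0
    while revisions < max_revisions:
        review = llm(f"REFLECT\nContext:\n{context}\nDraft: {draft}\n"
                     "Reply OK or list the problems.")
        if review.strip().upper().startswith("OK"):
            break
        draft = llm(f"REVISE\nPlan: {plan}\nContext:\n{context}\n"
                    f"Draft: {draft}\nProblems: {review}")
        revisions += 1

    return {"plan": plan, "answer": draft, "revisions": revisions}


def stub_llm(prompt: str) -> str:
    """Canned stand-in for a real model so the control flow can be run offline."""
    stage = prompt.split("\n", 1)[0]
    if stage == "PLAN":
        return "Identify symptoms -> Match causes -> Recommend verification -> Suggest treatment"
    if stage == "DRAFT":
        return "Yellowing and curling may have several causes."
    if stage == "REFLECT":
        draft_part = prompt.split("Draft:", 1)[1]
        return "OK" if "[1]" in draft_part else "Missing citation of the knowledge base."
    # Any other stage (here: REVISE) returns a draft that cites fragment 1.
    return "Yellowing and curling may have several causes; see fragment [1]."


result = plan_draft_reflect(
    "Why are my tomato leaves turning yellow and curling?",
    ["Placeholder fragment about leaf yellowing."],
    stub_llm,
)
```

Capping the number of revisions bounds the extra latency of the Reflect stage, since each review or revision costs one more model call.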