Agricultural Intelligence in Bangladesh: A Two-Stage Machine Learning System for Crop Recommendation and Yield Prediction

A complete end-to-end machine learning solution that provides crop recommendations and yield predictions to farmers in 64 regions of Bangladesh without requiring soil nutrient data, integrating real-time weather APIs and deployed as a mobile-friendly Streamlit application.

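Tags: machine learning, agriculture, crop recommendation, yield prediction, Bangladesh, KNN, decision tree, Streamlit, precision agriculture, data science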
Published 2026-05-23 14:45 | Recent activity 2026-05-23 14:48 | Estimated read 7 min
Section 01

[Introduction] Bangladesh's Two-Stage Agricultural Intelligence System: An Innovative Solution for Crop Recommendation and Yield Prediction

Project Source

System Features

The system uses a two-stage architecture (crop recommendation followed by yield prediction), lightweight models suited to resource-constrained environments, and a mobile-first interface that farmers can use easily.

Section 02

Project Background and Core Challenges

In an agricultural country like Bangladesh, farmers face two core problems:

  1. Traditional experience is failing: climate change has disrupted long-standing planting patterns, making it hard to determine the best crop to grow;
  2. Data barrier: most farmers cannot afford soil nutrient testing (nitrogen, phosphorus, potassium, pH), so models that depend on soil data are hard to put into practice.

The project takes the absence of soil nutrient data as a design constraint and shows that an accurate prediction system can be built from geographic location, season, soil type, and climate features alone.

Section 03

Two-Stage System Architecture and Technical Methods

Stage 1: Crop Recommendation (Classification Task)

  • Input: Region, season, soil type, climate conditions (temperature/humidity/rainfall)
  • Model: K-Nearest Neighbors (K=7, distance-weighted), balancing interpretability and efficiency
  • Feature Processing: One-hot encoding for region/season/soil type, standardization for climate features, forming an 83-dimensional feature space
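
The Stage 1 steps above (one-hot encoding, standardization, distance-weighted voting with K=7) can be sketched in plain Python. This is only an illustration of the technique, not the project's code: the toy records, category lists, and function names are hypothetical, and the small epsilon added to the distance is a simplification to avoid division by zero.

```python
import math
from collections import defaultdict

def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1.0 if value == c else 0.0 for c in categories]

def fit_scaler(rows):
    """Per-column mean and standard deviation of the numeric climate features."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def encode(sample, vocab, scaler):
    """Concatenate one-hot categorical features with standardized climate features."""
    means, stds = scaler
    vec = []
    for key in ("region", "season", "soil"):
        vec += one_hot(sample[key], vocab[key])
    vec += [(x - m) / s for x, m, s in zip(sample["climate"], means, stds)]
    return vec

def knn_predict(query, train_vecs, train_labels, k=7):
    """Distance-weighted vote among the k nearest training vectors."""
    nearest = sorted((math.dist(query, v), y)
                     for v, y in zip(train_vecs, train_labels))[:k]
    votes = defaultdict(float)
    for dist, label in nearest:
        votes[label] += 1.0 / (dist + 1e-9)  # closer neighbours count more
    return max(votes, key=votes.get)

# Hypothetical toy data; climate = [temperature, humidity, rainfall].
vocab = {"region": ["Dhaka", "Rajshahi"],
         "season": ["Kharif", "Rabi"],
         "soil": ["Loamy", "Clay"]}
train = [
    ({"region": "Dhaka", "season": "Kharif", "soil": "Clay",
      "climate": [29.0, 82.0, 310.0]}, "Rice"),
    ({"region": "Dhaka", "season": "Rabi", "soil": "Loamy",
      "climate": [20.0, 65.0, 15.0]}, "Wheat"),
    ({"region": "Rajshahi", "season": "Rabi", "soil": "Loamy",
      "climate": [19.0, 60.0, 10.0]}, "Wheat"),
    ({"region": "Rajshahi", "season": "Kharif", "soil": "Clay",
      "climate": [30.0, 80.0, 280.0]}, "Rice"),
]
scaler = fit_scaler([s["climate"] for s, _ in train])
train_vecs = [encode(s, vocab, scaler) for s, _ in train]
train_labels = [label for _, label in train]

query = encode(train[0][0], vocab, scaler)
prediction = knn_predict(query, train_vecs, train_labels)
print(prediction)
```

With the real category lists, the same encoding would produce the 83-dimensional vectors described above; the toy vocabulary here yields only 9 dimensions.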

Stage 2: Yield Prediction (Regression Task)

  • Input: Same climate and geographic features
  • Model: Decision Tree Regressor (max depth 25), adapted to non-linear relationships
  • Design: Two models are optimized independently, supporting future expansion of crop types
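
To make the Stage 2 idea concrete, here is a minimal regression tree in plain Python, assuming squared-error splits and a depth cap like the one quoted above. It is a sketch of the general technique rather than the project's model: the rainfall and yield numbers are invented, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, max_depth=25, depth=0):
    """Recursively split on the (feature, threshold) pair that minimizes total SSE."""
    if depth >= max_depth or len(set(y)) == 1:
        return mean(y)  # leaf: predict the mean target value
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:  # all rows share the same features: no split possible
        return mean(y)
    _, j, t = best
    left_idx = [i for i, row in enumerate(X) if row[j] <= t]
    right_idx = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": build_tree([X[i] for i in left_idx], [y[i] for i in left_idx],
                           max_depth, depth + 1),
        "right": build_tree([X[i] for i in right_idx], [y[i] for i in right_idx],
                            max_depth, depth + 1),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's value."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Invented data: monthly rainfall (mm) against yield, with a non-monotone shape.
X = [[50.0], [100.0], [150.0], [200.0], [250.0], [300.0]]
y = [1.0, 2.5, 3.0, 3.0, 2.4, 1.2]
tree = build_tree(X, y, max_depth=25)
print(predict(tree, [150.0]), predict(tree, [300.0]))
```

Because each split is a threshold on one feature, the tree can follow a yield curve that rises and then falls without any manual feature transformation.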

Section 04

Model Performance Comparison and Selection Basis

Crop Recommendation Model Accuracy Comparison

Model                        | Accuracy
KNN (k=7, distance-weighted) | 88.27%
Random Forest                | 86.51%
Gradient Boosting            | 85.34%

Selection Reason: KNN matches the intuition that similar regions grow similar crops, has no iterative training phase, and is reported to keep the model small enough for constrained environments.

Yield Prediction Model R² Score Comparison

Model                       | R² Score
Decision Tree Regressor     | 0.8621
Random Forest Regressor     | 0.8555
Gradient Boosting Regressor | 0.7431

Selection Reason: Decision trees naturally adapt to non-linear threshold effects between yield and climate (e.g., U-shaped impact of rainfall).
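
For readers unfamiliar with the two metrics in the tables above, the following sketch shows the standard definitions of classification accuracy and the R² score. The function names and sample values are illustrative only and are not taken from the project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Illustrative values only.
print(accuracy(["Rice", "Wheat", "Jute", "Rice"], ["Rice", "Wheat", "Rice", "Rice"]))
print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

An R² of 1.0 means the predictions match the observed yields exactly, while a model that always predicts the mean yield scores 0.0.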

Section 05

Data Engineering and Deployment Details

Data Processing

  • Source: Official records from Bangladesh's government agricultural department (4,608 original records, augmented to 200,000)
  • Feature Design: Temperature (average/maximum/minimum), humidity (average/maximum/minimum), rainfall (monthly total)
  • Data Cleaning: Physical plausibility verification to exclude outliers
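
The source does not list the exact plausibility rules, so the following is only a guess at what such a check might look like. The field names, the temperature range, and the rainfall ceiling are all assumptions; only the 0 to 100 % humidity range and the ordering minimum ≤ average ≤ maximum follow from the physical definitions.

```python
def is_plausible(rec, temp_range=(0.0, 50.0), max_monthly_rain_mm=3000.0):
    """Return True if a climate record passes basic physical sanity checks.

    temp_range and max_monthly_rain_mm are assumed bounds, not project values.
    """
    temp_ok = (temp_range[0] <= rec["temp_min"] <= rec["temp_avg"]
               <= rec["temp_max"] <= temp_range[1])
    hum_ok = 0.0 <= rec["hum_min"] <= rec["hum_avg"] <= rec["hum_max"] <= 100.0
    rain_ok = 0.0 <= rec["rain_mm"] <= max_monthly_rain_mm
    return temp_ok and hum_ok and rain_ok

# Hypothetical records: the second has humidity above 100 % and negative rainfall.
records = [
    {"temp_min": 22.0, "temp_avg": 27.5, "temp_max": 33.0,
     "hum_min": 60.0, "hum_avg": 78.0, "hum_max": 95.0, "rain_mm": 310.0},
    {"temp_min": 22.0, "temp_avg": 27.5, "temp_max": 33.0,
     "hum_min": 60.0, "hum_avg": 78.0, "hum_max": 120.0, "rain_mm": -5.0},
]
clean = [r for r in records if is_plausible(r)]
print(len(clean))
```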

External Integration

  • Weather API: Open-Meteo (free, no API key)
  • GPS Positioning: Match nearest region via Euclidean distance
  • Soil Type: Built-in region mapping table, supports manual modification
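
The GPS matching and weather lookup above might be wired together roughly as follows. This is a sketch under assumptions: the three centroid coordinates are approximate and stand in for the full 64-region table, the helper names are hypothetical, and the choice of `current` variables requested from Open-Meteo is a guess at what the app needs rather than something the source specifies.

```python
import json
import math
from urllib.parse import urlencode
from urllib.request import urlopen

# Approximate (latitude, longitude) centroids; an illustrative subset only.
REGION_CENTROIDS = {
    "Dhaka": (23.81, 90.41),
    "Chattogram": (22.36, 91.78),
    "Rajshahi": (24.37, 88.60),
}

def nearest_region(lat, lon, centroids=REGION_CENTROIDS):
    """Return the region whose centroid is closest in Euclidean distance (degrees)."""
    return min(centroids, key=lambda name: math.dist((lat, lon), centroids[name]))

def open_meteo_url(lat, lon):
    """Build a forecast request URL; Open-Meteo needs no API key."""
    params = {
        "latitude": lat,
        "longitude": lon,
        "current": "temperature_2m,relative_humidity_2m,precipitation",
    }
    return "https://api.open-meteo.com/v1/forecast?" + urlencode(params)

def fetch_current_weather(lat, lon, timeout=10):
    """Fetch current conditions over the network and return the 'current' object."""
    with urlopen(open_meteo_url(lat, lon), timeout=timeout) as resp:
        return json.load(resp)["current"]

region = nearest_region(23.7, 90.4)
url = open_meteo_url(23.7, 90.4)
print(region)
print(url)
```

Euclidean distance on raw degrees ignores the Earth's curvature, which is a reasonable shortcut over an area the size of Bangladesh; a haversine distance would be the more general choice.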

Deployment

  • Framework: Streamlit (a declarative Python script converts quickly into an interactive application)
  • Experience: Mobile-first design, sub-second inference latency
  • Hosting: Streamlit Community Cloud free service

Section 06

Practical Significance and Key Insights

Practical Value

The system gives smallholder farmers in Bangladesh a low-cost decision tool that helps them choose suitable crops before planting, with the aim of increasing their income.

Scalability

  • Architecture can be ported to other developing countries (replace region/soil/climate data)
  • Open-source nature supports local secondary development (add special crops, adjust parameters)

Key Insights

  1. Simple algorithms can outperform complex models: in this project, KNN and a single decision tree scored higher than the Random Forest and Gradient Boosting baselines;
  2. Feature engineering first: the carefully designed 83-dimensional feature space is the foundation of performance;
  3. Constraint-driven design: The requirement of no soil data shapes the system architecture;
  4. Deployment experience is key: Ease of use determines the value of technology implementation.