Zing Forum

Land Use and Land Cover Classification in Odisha Coastal Zone: A Remote Sensing Big Data Processing Practice Using Sentinel-2 and Random Forest

An in-depth look at the LULC_ODISHA project, a high-throughput geospatial data science pipeline that uses Sentinel-2 imagery and a balanced random forest algorithm to automatically classify land use and land cover across the 47,054 km² Odisha coastal zone, processing over 47 million spatial pixels.

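Remote sensing, land use classification, Sentinel-2, random forest, geospatial data, Odisha, coastal zone monitoring, machine learning, block processing, QGIS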
Published 2026-06-10 13:15 | Recent activity 2026-06-10 13:19 | Estimated read 8 min

Section 01

[Introduction] Core Overview of the Odisha Coastal Zone LULC Classification Project

This article introduces the LULC_ODISHA project, which targets the coastal zone of Odisha, India (47,054 km²). It uses Sentinel-2 remote sensing imagery and a balanced random forest algorithm to automate land use and land cover classification, processing over 47 million spatial pixels. Developed by GOURGOPAL618, the project is open source on GitHub (https://github.com/GOURGOPAL618/LULC_ODISHA) and was released on June 10, 2026. Its technical pipeline offers a reference example for large-scale remote sensing data processing.

Section 02

Project Background and Significance

Land Use and Land Cover (LULC) classification is a core application of remote sensing. It is crucial for environmental monitoring, urban planning, agricultural management, and ecological protection, especially in ecologically sensitive coastal zones. The coastal zone of Odisha, India, faces pressure from urban expansion, changes in agricultural land, and wetland degradation, so high-precision LULC data is urgently needed to support evidence-based decisions. To meet that need, the LULC_ODISHA project builds a high-throughput geospatial processing pipeline.

Section 03

Data Foundation and Processing Scale

Data Source: Uses the European Space Agency (ESA) Sentinel-2 Multispectral Instrument (MSI) Level-2A surface reflectance products, with advantages including:

  • Spatial resolution: 10 m (visible/near-infrared), 20 m (red edge/shortwave infrared), 60 m (atmospheric bands used for aerosol and water vapour correction)
  • Revisit cycle: 5 days (two-satellite constellation)
  • Spectral coverage: 13 bands (visible, near-infrared, shortwave infrared)

Processing Scale:

  • Spatial coverage: Approximately 47,054.3 km²
  • Number of pixels: Over 47 million reported; the original 25604×18412 matrix itself holds about 471 million cells
  • Grid resolution: 10m×10m
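
The scale figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the matrix dimensions and the 10 m grid size quoted in the list; the variable names are illustrative and not taken from the project.

```python
# Cross-check the raster dimensions against the stated footprint.
ROWS, COLS = 25604, 18412   # original matrix size quoted above
PIXEL_SIZE_M = 10           # 10 m x 10 m grid cells

total_cells = ROWS * COLS
cell_area_km2 = (PIXEL_SIZE_M * PIXEL_SIZE_M) / 1_000_000  # m^2 -> km^2
matrix_area_km2 = total_cells * cell_area_km2

print(f"cells in matrix : {total_cells:,}")
print(f"matrix area     : {matrix_area_km2:,.1f} km^2")
```

The full matrix covers roughly 47,142 km², which is close to the 47,054.3 km² spatial coverage stated above.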

To handle this data volume within limited memory, the project relies on a block processing strategy.

Section 04

Technical Architecture and Core Algorithm

Classification Engine: A balanced random forest algorithm, chosen for its resistance to overfitting, its built-in feature importance estimates, and its handling of class imbalance.
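
The source does not show how the class balancing is configured. One common approach is inverse-frequency class weighting, where each class receives the weight n_samples / (n_classes × class_count); scikit-learn's class_weight="balanced" option uses this heuristic. The pure-Python sketch below illustrates it. The label counts are made-up values for illustration, not the project's training data, and the function name is hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}

# Hypothetical training labels: water is common, mangrove and built-up are rare.
labels = ["water"] * 60 + ["cropland"] * 30 + ["mangrove"] * 8 + ["built_up"] * 2
weights = balanced_class_weights(labels)

# Rare classes end up with the largest weights.
for cls, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{cls:10s} weight = {w:.3f}")
```

With weights like these, misclassifying a rare class such as built-up land costs the model more than misclassifying a common one, which counteracts the dominance of water and cropland pixels.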

Block Processing Strategy:

  1. Grid tiling: Divide the original image into 512×512 pixel tiles to balance computational efficiency against memory usage;
  2. NoData optimization: Skip tiles that contain only NoData (deep sea or unmapped areas), detected via np.all(block == 0), which is reported to cut the computational load by about half;
  3. NVMe cache: Write prediction results to high-speed storage page by page, so the full output never has to fit in memory.
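
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: np.memmap stands in for the NVMe-backed cache, toy_predict is a placeholder for the trained classifier, the function names are hypothetical, and raster I/O is left out.

```python
import os
import tempfile

import numpy as np

TILE = 512  # tile edge length in pixels, as described above

def iter_tiles(height, width, tile=TILE):
    """Yield (row_slice, col_slice) windows covering the raster; edge tiles may be smaller."""
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            yield slice(r, min(r + tile, height)), slice(c, min(c + tile, width))

def classify_in_tiles(image, predict, out_path, tile=TILE):
    """Classify a (bands, H, W) array tile by tile into a disk-backed uint8 map.

    Tiles that are entirely zero are treated as NoData and skipped,
    so they keep the value 0 in the output.
    """
    _, height, width = image.shape
    out = np.memmap(out_path, dtype=np.uint8, mode="w+", shape=(height, width))
    processed = skipped = 0
    for rows, cols in iter_tiles(height, width, tile):
        block = image[:, rows, cols]
        if np.all(block == 0):  # NoData check quoted in the text
            skipped += 1
            continue
        pixels = block.reshape(block.shape[0], -1).T  # (n_pixels, n_bands)
        out[rows, cols] = predict(pixels).reshape(block.shape[1:])
        processed += 1
    out.flush()  # push pending pages to disk
    return out, processed, skipped

def toy_predict(pixels):
    """Placeholder classifier: class 1 where the first band exceeds 0.5, else class 2."""
    return np.where(pixels[:, 0] > 0.5, 1, 2).astype(np.uint8)

# Small demo: an 8x10 raster with data only in the top-left 4x4 tile.
rng = np.random.default_rng(0)
image = np.zeros((3, 8, 10), dtype=np.float32)
image[:, :4, :4] = rng.random((3, 4, 4)) + 0.01

with tempfile.TemporaryDirectory() as tmp:
    out, processed, skipped = classify_in_tiles(
        image, toy_predict, os.path.join(tmp, "lulc.dat"), tile=4
    )
    result = np.array(out)  # copy into memory before the temp file is removed
    del out

print(f"processed={processed}, skipped={skipped}")
```

Applied to the 25604×18412 raster, 512-pixel tiling yields 51 × 36 = 1,836 tiles, with partial tiles along the bottom and right edges.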

Section 05

Spectral Features and Classification Results

Band Applications:

  • Visible bands (B2 blue, B3 green, B4 red): Identify built-up areas, sandy formations, and coastal urban layouts;
  • Near-infrared band (B8 NIR): Capture the strong near-infrared reflectance of healthy vegetation, which contrasts with chlorophyll absorption in the red band, to identify mangrove stands in the delta;
  • Shortwave infrared band (B11 SWIR): Distinguish muddy waterways, wet farmland, and exposed riverbeds.
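
The source does not say whether the project feeds spectral indices or raw bands to the classifier. As an illustration of why these bands separate the classes, the sketch below computes two widely used normalized-difference indices: NDVI from B8 and B4, and MNDWI from B3 and B11. The reflectance values are hypothetical, chosen only to show the typical sign of each index.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0

def ndvi(nir_b8, red_b4):
    """Vegetation index: high when NIR reflectance far exceeds red."""
    return normalized_difference(nir_b8, red_b4)

def mndwi(green_b3, swir_b11):
    """Water index: high when green reflectance exceeds SWIR."""
    return normalized_difference(green_b3, swir_b11)

# Hypothetical surface reflectances (0-1 scale), for illustration only.
samples = {
    "dense vegetation": {"B3": 0.05, "B4": 0.04, "B8": 0.40, "B11": 0.15},
    "open water":       {"B3": 0.06, "B4": 0.04, "B8": 0.02, "B11": 0.01},
}

for name, s in samples.items():
    print(f"{name:17s} NDVI={ndvi(s['B8'], s['B4']):+.2f} "
          f"MNDWI={mndwi(s['B3'], s['B11']):+.2f}")
```

Because B11 is delivered at 20 m, it would need to be resampled to the 10 m grid before being combined with B3, B4, or B8.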

Classification Results: A total of 8 land cover types, with area statistics as follows:

LULC Category                             Area (km²)   Coverage
Bay of Bengal/Deep Water                    17339.95     36.85%
Paddy Fields/Arable Land                    12976.33     27.58%
Agricultural Wetlands/Mixed Areas           12784.59     27.17%
Rivers and Inland Waterways                  1527.20      3.25%
Coastal Beaches/Spits                        1277.09      2.71%
Mangroves/Dense Coastal Forests               796.56      1.69%
Residential Areas/Urban Built-up Zones        283.14      0.60%
Fallow Land/Open Areas                         69.43      0.15%

The region is dominated by the open water of the Bay of Bengal (36.85%) and agricultural land (54.75%, combining paddy fields and agricultural wetlands), while natural vegetation and built-up areas make up only small proportions.
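
The coverage percentages follow directly from the area column. The short check below recomputes them from the areas in the table; the dictionary layout and variable names are illustrative.

```python
# Areas in km^2, copied from the classification results table.
areas_km2 = {
    "Bay of Bengal/Deep Water": 17339.95,
    "Paddy Fields/Arable Land": 12976.33,
    "Agricultural Wetlands/Mixed Areas": 12784.59,
    "Rivers and Inland Waterways": 1527.20,
    "Coastal Beaches/Spits": 1277.09,
    "Mangroves/Dense Coastal Forests": 796.56,
    "Residential Areas/Urban Built-up Zones": 283.14,
    "Fallow Land/Open Areas": 69.43,
}

total_km2 = sum(areas_km2.values())
coverage = {name: 100 * area / total_km2 for name, area in areas_km2.items()}

# Agricultural share = paddy fields plus agricultural wetlands.
agricultural = (coverage["Paddy Fields/Arable Land"]
                + coverage["Agricultural Wetlands/Mixed Areas"])

print(f"total area        : {total_km2:.2f} km^2")
print(f"agricultural share: {agricultural:.2f}%")
for name, pct in coverage.items():
    print(f"{name:40s} {pct:6.2f}%")
```

The areas sum to 47,054.29 km², matching the stated coverage of the study zone.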

Section 06

Accuracy Verification and Quality Control

QGIS Visual Verification: Classification results are checked in QGIS against current Earth observation basemaps; for example, the mapped boundary of Chilika Lake (India's largest brackish lagoon) closely matches the actual terrain.
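
The source describes a visual check and reports no numeric accuracy figures. If labelled reference points were sampled in QGIS, a quantitative complement could look like the sketch below, which computes overall accuracy and per-class producer's accuracy. The reference and predicted labels are invented for illustration, and the function names are hypothetical.

```python
def overall_accuracy(reference, predicted):
    """Fraction of reference points whose mapped class matches the reference class."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(r == p for r, p in zip(reference, predicted))
    return correct / len(reference)

def producers_accuracy(reference, predicted, cls):
    """Share of reference points of class `cls` that were mapped as `cls`."""
    hits = [p == cls for r, p in zip(reference, predicted) if r == cls]
    if not hits:
        raise ValueError(f"no reference points for class {cls!r}")
    return sum(hits) / len(hits)

# Hypothetical validation points: one mangrove point was mapped as cropland.
reference = ["water", "water", "mangrove", "mangrove",
             "cropland", "cropland", "cropland", "built_up"]
predicted = ["water", "water", "mangrove", "cropland",
             "cropland", "cropland", "cropland", "built_up"]

print(f"overall accuracy        : {overall_accuracy(reference, predicted):.3f}")
print(f"mangrove producer's acc.: {producers_accuracy(reference, predicted, 'mangrove'):.3f}")
```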

Data Governance: The repository includes DATA_GOVERNANCE.md, which records the traceability of sensor radiometric calibration, and stac_catalog.json, which provides spatio-temporal database specifications; together they support data quality and reproducibility.

Section 07

Application Value and Future Outlook

Technical Insights: The project offers reusable lessons in memory management (tiling plus NVMe caching), computational optimization (NoData filtering), band selection, and verification strategy.

Regional Applications: Supports coastal zone management (beach erosion monitoring), agricultural planning (arable land distribution), ecological protection (mangrove monitoring), disaster assessment (flood mapping), and urban planning (built-up area expansion).

Outlook: The project framework can be ported to other coastal zones, and its open-source release gives practitioners in environmental monitoring and related fields a practical example to build on.