Zing Forum

Intelligent Analysis Agent for Rock Thin Sections: A Multimodal Large Model-Driven Geological Mineral Identification System

This is an intelligent geological analysis system built on a multimodal large language model. Driven by natural-language dialogue, the Agent autonomously calls image analysis tools to classify minerals in rock thin sections, segment ooids, and generate professional reports. The system is purely front-end: it can be deployed to GitHub Pages and runs without a back-end server.

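Tags: multimodal large model, rock thin section analysis, mineral classification, geological AI, Agent architecture, Function Calling, RAG, pure front-end deployment, MiMo, geological intelligence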
Published 2026-05-07 22:35 · Recent activity 2026-05-07 22:50 · Estimated read 5 min

Section 02

Project Background and Core Issues

Rock thin section analysis is a core task in geology. The traditional workflow depends on expert experience and is slow: beginners and field workers struggle to identify minerals quickly and accurately, and even experienced analysts are burdened when handling large numbers of samples. AI assistance has therefore become an important research direction in geological informatization.

Section 03

System Architecture and Design Philosophy

The system uses an Agent architecture built around the MiMo-v2.5 large language model, which calls tools autonomously through Function Calling. The front end is built with React 18.3, TypeScript 5.6, and Vite 5.4. It can be deployed to GitHub Pages and includes a built-in Mock server that can demonstrate all functions.
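
To make the Function Calling step more concrete, here is a minimal sketch of a tool registry and dispatcher. It assumes the message shapes used by OpenAI-compatible chat completion APIs (a tool call carries `function.name` and a JSON-string `function.arguments`, and the reply is a `tool` message with a matching `tool_call_id`). The tool name `classify_minerals`, the registry layout, and the placeholder result are hypothetical and are not taken from the project.

```typescript
// Hypothetical tool registry; names and result shapes are illustrative only.
type ToolArgs = Record<string, unknown>;

interface ToolSpec {
  description: string;
  parameters: object; // JSON Schema describing the arguments
  run: (args: ToolArgs) => unknown;
}

interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
}

const tools: Record<string, ToolSpec> = {
  classify_minerals: {
    description: "Classify minerals in an uploaded thin section image.",
    parameters: {
      type: "object",
      properties: { imageId: { type: "string" } },
      required: ["imageId"],
    },
    // Placeholder result; a real tool would run the classifier here.
    run: (args) => ({ imageId: args.imageId, minerals: [] as string[] }),
  },
};

// Convert the registry into the `tools` array sent with a chat request.
function toToolDefinitions(registry: Record<string, ToolSpec>) {
  return Object.entries(registry).map(([name, spec]) => ({
    type: "function" as const,
    function: { name, description: spec.description, parameters: spec.parameters },
  }));
}

// Run one tool call and wrap the outcome as a `tool` message for the next round.
function dispatchToolCall(call: ToolCall, registry: Record<string, ToolSpec>) {
  const spec = registry[call.function.name];
  let content: string;
  if (!spec) {
    content = JSON.stringify({ error: `unknown tool: ${call.function.name}` });
  } else {
    try {
      const args = JSON.parse(call.function.arguments) as ToolArgs;
      content = JSON.stringify(spec.run(args));
    } catch (err) {
      // Malformed arguments or a failing tool are reported back to the model.
      content = JSON.stringify({ error: String(err) });
    }
  }
  return { role: "tool" as const, tool_call_id: call.id, content };
}
```

Returning errors as tool messages, rather than throwing, lets the model see the failure and decide whether to retry or answer without the tool.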

Section 04

Analysis of Core Functional Modules

  1. Automatic Mineral Classification: Integrates deep learning models to identify mineral types and report a confidence level for each.
  2. Intelligent Ooid Segmentation: Detects ooids in sedimentary rocks and measures their count and area proportion.
  3. Knowledge Base Retrieval: Holds 53 professional knowledge entries and implements client-side RAG with Fuse.js fuzzy search.
  4. Intelligent Report Generation: Combines multi-source information into structured Markdown reports with streaming output.
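
As an illustration of the ooid statistics in item 2, the sketch below counts ooids and computes their area proportion from a binary segmentation mask. The mask format (`1` for ooid pixels, `0` for background) and the function name `ooidStats` are assumptions made for this example; the text does not say how the project represents segmentation output.

```typescript
// Count connected foreground regions (4-connectivity) in a binary mask and
// report the fraction of pixels they cover. mask[y][x] is 1 for ooid, 0 otherwise.
interface OoidStats {
  count: number;
  areaFraction: number; // ooid pixels / total pixels, in [0, 1]
}

function ooidStats(mask: number[][]): OoidStats {
  const height = mask.length;
  const width = height > 0 ? mask[0].length : 0;
  const visited = mask.map((row) => row.map(() => false));
  let count = 0;
  let ooidPixels = 0;

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (mask[y][x] !== 1 || visited[y][x]) continue;
      count++; // found a pixel of a region not seen before
      // Iterative flood fill, to avoid deep recursion on large regions.
      const stack: Array<[number, number]> = [[x, y]];
      visited[y][x] = true;
      while (stack.length > 0) {
        const [cx, cy] = stack.pop()!;
        ooidPixels++;
        const neighbours: Array<[number, number]> = [
          [cx + 1, cy],
          [cx - 1, cy],
          [cx, cy + 1],
          [cx, cy - 1],
        ];
        for (const [nx, ny] of neighbours) {
          if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
          if (mask[ny][nx] !== 1 || visited[ny][nx]) continue;
          visited[ny][nx] = true;
          stack.push([nx, ny]);
        }
      }
    }
  }

  const total = width * height;
  return { count, areaFraction: total === 0 ? 0 : ooidPixels / total };
}
```

For the 3×4 mask `[[1,1,0,0],[1,0,0,1],[0,0,1,1]]` this returns `{ count: 2, areaFraction: 0.5 }`. Ooids that touch are merged into one region by this approach, so a real pipeline would likely need instance-level segmentation to separate them.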

Section 05

Technical Implementation Details

  • Agent Orchestrator: Implements the Agentic Loop mechanism for multi-round reasoning cycles.
  • Three-level Degradation Strategy: Agentic mode (the LLM calls tools autonomously), then keyword-based intent fallback, then a pure template report.
  • Multimodal Vision: MiMo-v2.5 can analyze the visual features of images.
  • Memory Module: Summarizes the dialogue automatically once memory exceeds 20 entries, and keeps an LRU cache of image analysis results in localStorage.
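
The three-level degradation strategy can be sketched as a small planning function. This is only one possible reading of the description above: the keyword table, the tool names (`classify_minerals`, `segment_ooids`, `generate_report`), and the convention that `null` means "the LLM call failed or is not configured" are all assumptions for illustration.

```typescript
type Level = "agentic" | "keyword" | "template";

interface Plan {
  level: Level;
  tools: string[]; // tool names to run, in order
}

// Hypothetical keyword table for the second level; the project's real
// keywords are not given in the text.
const KEYWORD_INTENTS: Array<{ pattern: RegExp; tool: string }> = [
  { pattern: /mineral|classif/i, tool: "classify_minerals" },
  { pattern: /ooid/i, tool: "segment_ooids" },
  { pattern: /report/i, tool: "generate_report" },
];

// llmToolNames: tools chosen by the LLM, or null when the LLM call failed
// or no LLM is configured.
function planWithFallback(userMessage: string, llmToolNames: string[] | null): Plan {
  // Level 1: agentic mode, trust the LLM's own tool selection.
  if (llmToolNames !== null) {
    return { level: "agentic", tools: llmToolNames };
  }
  // Level 2: infer intent from keywords in the user's message.
  const matched = KEYWORD_INTENTS
    .filter(({ pattern }) => pattern.test(userMessage))
    .map(({ tool }) => tool);
  if (matched.length > 0) {
    return { level: "keyword", tools: matched };
  }
  // Level 3: no LLM and no recognizable intent, fall back to a template report.
  return { level: "template", tools: [] };
}
```

With the LLM unavailable, `planWithFallback("Please classify the minerals and count ooid content", null)` selects the keyword level with both `classify_minerals` and `segment_ooids`, while an unrecognized message falls through to the template level.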

Section 06

Deployment and Usage Methods

Easy Deployment: The static files can be hosted on GitHub Pages, with automatic deployment via GitHub Actions. Usage Process: Configure an LLM service (any OpenAI-compatible API is supported) → upload thin section images → ask questions in natural language (for example, analyze the minerals or count the ooid content).
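
As a rough illustration of the "configure an LLM service" step, the sketch below builds a request for an OpenAI-compatible chat completions endpoint that sends a question together with a thin section image. The `LlmConfig` shape and the function name `buildChatRequest` are hypothetical, and the base URL and model id in any call are placeholders to be replaced with the values for the chosen service.

```typescript
// Hypothetical settings object; field names are assumptions for this sketch.
interface LlmConfig {
  baseUrl: string; // root of the OpenAI-compatible API, e.g. ending in "/v1"
  apiKey: string;
  model: string; // model id as expected by the chosen service
}

// Build the URL and fetch() options for one multimodal chat request.
function buildChatRequest(config: LlmConfig, imageDataUrl: string, question: string) {
  // Tolerate a trailing slash in the configured base URL.
  const url = `${config.baseUrl.replace(/\/+$/, "")}/chat/completions`;
  const body = {
    model: config.model,
    stream: true, // ask for streamed output, as used for report generation
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: question },
          // The image is passed inline as a data URL (base64-encoded).
          { type: "image_url", image_url: { url: imageDataUrl } },
        ],
      },
    ],
  };
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed directly to `fetch(url, init)`. Because there is no back-end server, the API key in this design is held in the browser, so it is best treated as a user-supplied setting and not bundled with the static files.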

Section 07

Innovation Points and Application Value

Innovation Points: The system combines a multimodal LLM with geological knowledge and uses the Agent architecture to make the analysis flexible and intelligent. Application Value: It helps students learn mineral identification, assists preliminary screening in the field, and offers new ideas for digitizing geological data.

Section 08

Limitations and Future Prospects

Limitations: Training data coverage is insufficient, so rare minerals are identified inaccurately; the pure front-end design depends on the network environment; and the knowledge base needs expansion. Future Directions: Expand the range of mineral and rock types, integrate professional databases, support geochemical analysis, and develop offline inference capabilities.