# Stellar LLM Classifier: An Intelligent Star Classification System Combining Astrophysical Rules and Large Language Models

> A hybrid astronomical tool that combines deterministic, rule-based computation ("hard computing") with the AstroSage-8B large language model to classify the spectral types of Gaia DR3 stars automatically and to generate natural language descriptions of them.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-07T21:12:03.000Z
- Last activity: 2026-06-07T21:26:40.263Z
- Popularity: 154.8
- Keywords: stellar classification, large language models, astrophysics, Gaia DR3, AstroSage-8B, hybrid computing, spectral types, machine learning, astronomy AI, natural language generation
- Page link: https://www.zingnex.cn/en/forum/thread/stellar-llm-classifier-b59bc222
- Canonical: https://www.zingnex.cn/forum/thread/stellar-llm-classifier-b59bc222

---

## Introduction: Stellar LLM Classifier, a Star Classification Tool Fusing Astrophysics and Large Language Models

Stellar LLM Classifier is a star classification tool built on a hybrid architecture. It pairs traditional deterministic astrophysical computation (hard computing) with the AstroSage-8B large language model to classify the spectral types of Gaia DR3 stars automatically and to generate natural language descriptions of the results. The tool runs locally, so the data never leaves the user's machine, and it is aimed at astronomy enthusiasts and researchers without a programming background.

## Project Background and Source

- Original author/maintainer: bennylimpid196
- Source platform: GitHub
- Original title: stellar-llm-classifier
- Release date: June 3, 2026
- Last update: June 7, 2026
- Project purpose: To enable astronomy enthusiasts and researchers without programming backgrounds to easily use advanced star classification technology to process the Gaia DR3 dataset.

## Core Technology and Method Architecture

### Hybrid Computing Paradigm
- **Hard Computing Layer**: Applies astrophysical rules to parameters such as absolute magnitude, effective temperature, and surface gravity, and assigns a class according to the Morgan-Keenan (MK) spectral classification system. This keeps the classification physically interpretable and rigorous.
- **Soft Computing Layer**: Calls the fine-tuned AstroSage-8B large language model (8 billion parameters, trained on astronomical literature) to turn the technical classification data into professional natural language descriptions.
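
To make the hard computing layer concrete, here is a minimal sketch of a rule-based MK classifier. It is not the project's code: the temperature bounds are approximate textbook values, the log g cuts are illustrative assumptions, and the names `TEFF_LOWER_BOUNDS`, `MKType`, and `classify` are hypothetical. Absolute magnitude is left out for brevity.

```python
from typing import NamedTuple

# Approximate lower Teff bounds in kelvin, hottest class first.
# Textbook round numbers for illustration, NOT the project's thresholds.
TEFF_LOWER_BOUNDS = [
    ("O", 30000.0),
    ("B", 10000.0),
    ("A", 7500.0),
    ("F", 6000.0),
    ("G", 5200.0),
    ("K", 3700.0),
    ("M", 2400.0),
]


class MKType(NamedTuple):
    spectral_class: str    # O, B, A, F, G, K or M
    luminosity_class: str  # I, III, IV or V in this simplified sketch


def spectral_class(teff_k: float) -> str:
    """Return the first class whose lower bound the temperature reaches."""
    for letter, lower_bound in TEFF_LOWER_BOUNDS:
        if teff_k >= lower_bound:
            return letter
    raise ValueError(f"Teff {teff_k} K is below the range covered by this sketch")


def luminosity_class(logg: float) -> str:
    """Map surface gravity (log g, cgs) to a coarse luminosity class.

    The cuts are assumptions made for this example; real luminosity
    classes overlap in log g and depend on spectral type.
    """
    if logg >= 4.0:
        return "V"    # dwarf (main sequence)
    if logg >= 3.5:
        return "IV"   # subgiant
    if logg >= 1.5:
        return "III"  # giant
    return "I"        # supergiant


def classify(teff_k: float, logg: float) -> MKType:
    """Deterministic classification: same inputs always give the same type."""
    return MKType(spectral_class(teff_k), luminosity_class(logg))


if __name__ == "__main__":
    print(classify(5772.0, 4.44))  # Sun-like parameters
```

Because every decision is a plain threshold comparison, each output can be traced back to the rule that produced it, which is the interpretability property the hard computing layer is meant to provide.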

### Data Processing Flow
1. Data Import: Upload a CSV file containing Gaia DR3 data
2. Parameter Validation: Check required fields such as absolute magnitude, effective temperature, and surface gravity
3. Rule-based Classification: The hard computing layer determines the MK spectral type
4. Intelligent Description: The soft computing layer generates natural language descriptions
5. Result Export: Output a report containing spectral types and descriptions
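
The five steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the column names (`source_id`, `abs_mag`, `teff`, `logg`) and the functions `validate_row` and `run_pipeline` are hypothetical, and the `classify` and `describe` callables stand in for the hard and soft computing layers. No real model is called.

```python
import csv
import io
from typing import Callable, Dict

# Hypothetical column names; the project's actual CSV schema is not documented here.
REQUIRED_FIELDS = ("abs_mag", "teff", "logg")
OUTPUT_FIELDS = ["source_id", "mk_type", "description", "error"]


def validate_row(row: Dict[str, str]) -> Dict[str, float]:
    """Step 2: return the required fields as floats, or raise ValueError."""
    values = {}
    for name in REQUIRED_FIELDS:
        raw = (row.get(name) or "").strip()
        if not raw:
            raise ValueError(f"missing field: {name}")
        values[name] = float(raw)  # raises ValueError on non-numeric text
    return values


def run_pipeline(
    csv_text: str,
    classify: Callable[[Dict[str, float]], str],
    describe: Callable[[Dict[str, float], str], str],
) -> str:
    """Steps 1 to 5: import, validate, classify, describe, export as CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=OUTPUT_FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        result = {"source_id": row.get("source_id") or "",
                  "mk_type": "", "description": "", "error": ""}
        try:
            params = validate_row(row)
            result["mk_type"] = classify(params)            # hard computing layer
            result["description"] = describe(params, result["mk_type"])  # soft layer
        except ValueError as exc:
            result["error"] = str(exc)  # keep the row, record why it failed
        writer.writerow(result)
    return out.getvalue()


# Usage with placeholder layers (stand-ins, not the real classifier or LLM).
sample = "source_id,abs_mag,teff,logg\n1,4.8,5772,4.44\n2,,4300,1.7\n"
report = run_pipeline(
    sample,
    classify=lambda p: "G-type" if 5200 <= p["teff"] < 6000 else "other",
    describe=lambda p, mk: f"{mk} star with Teff of {p['teff']:.0f} K",
)
print(report)
```

Rows that fail validation are kept in the report with an error message instead of aborting the batch, which is one reasonable design choice for large Gaia exports.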

## Model Performance Verification Results

### Core Metrics for Version V6 (Test Set: 498 Gaia DR3 Stars)
| Metric | Value |
|--------|-------|
| Accuracy | 0.7579 |
| Cohen's Kappa Coefficient | 0.7083 |
| Macro-average F1 Score | 0.6710 |
| Near Misses (Distance 1) | 0.9976 |
| Mean Absolute Error (ΔTeff) | 248.0 K |
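
For readers unfamiliar with these metrics, the sketch below shows how accuracy, Cohen's kappa, and macro-averaged F1 are conventionally computed from true and predicted class labels. The `within_one_class` function is one possible reading of the "Near Misses (Distance 1)" row, namely the share of predictions at most one step from the true class along the O-B-A-F-G-K-M sequence; that reading is an assumption, as are all function names and the toy labels.

```python
from collections import Counter
from typing import Sequence

CLASS_ORDER = "OBAFGKM"  # ordinal sequence used to measure class distance


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def within_one_class(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Share of predictions at most one class away from the truth (assumed reading)."""
    idx = CLASS_ORDER.index
    return sum(abs(idx(t) - idx(p)) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for illustration only, not the project's test set.
y_true = list("GGKKMF")
y_pred = list("GKKKMA")
print(accuracy(y_true, y_pred), cohen_kappa(y_true, y_pred),
      macro_f1(y_true, y_pred), within_one_class(y_true, y_pred))
```

Macro F1 weights every class equally, so it drops below accuracy when rare classes are classified poorly, which is consistent with the gap between 0.7579 and 0.6710 in the table.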

### Confidence Interval and Version Evolution
- Bootstrap 95% confidence interval for the error: [0.135, 0.205]. The interval is narrow (width 0.07), which points to a stable estimate.
- Version iteration: Across versions V1 to V7, the system prompts and verification strategies were refined. V7 raised accuracy to 0.7951 (from 0.7579 in V6) and lowered the mean absolute error to 212.2 K (from 248.0 K in V6).
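
A bootstrap confidence interval of the kind quoted above can be obtained by resampling the test set with replacement and recomputing the metric each time. The sketch below uses the percentile method; it is an assumption that the project used this variant, and `bootstrap_ci`, `error_rate`, and the toy labels are hypothetical.

```python
import random
from typing import Callable, Sequence, Tuple


def error_rate(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of misclassified stars."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    metric: Callable[[Sequence[str], Sequence[str]], float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for `metric` at level 1 - alpha."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Draw n indices with replacement and evaluate the metric on the resample.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lo_index = max(0, round((alpha / 2) * n_resamples))
    hi_index = min(n_resamples - 1, round((1 - alpha / 2) * n_resamples) - 1)
    return stats[lo_index], stats[hi_index]


# Toy labels for illustration only: 8 errors in 40 predictions.
y_true = list("G" * 20 + "K" * 20)
y_pred = list("G" * 16 + "K" * 4 + "K" * 16 + "G" * 4)
low, high = bootstrap_ci(y_true, y_pred, error_rate)
print(f"error rate {error_rate(y_true, y_pred):.3f}, 95% CI [{low:.3f}, {high:.3f}]")
```

The width of the interval shrinks roughly with the square root of the test set size, so a set of about 500 stars gives a much tighter interval than this 40-star toy example.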

## Application Scenarios and Usage Guide

### Target Users
- Astronomy Enthusiasts: Analyze star data without programming knowledge
- Educators: Demonstrate star classification concepts in teaching
- Researchers: Batch process the Gaia DR3 dataset
- Data Scientists: Explore the combination of astronomical data and natural language generation

### Usage Steps
1. Download the installer (.exe) from GitHub Releases
2. Run the installation wizard to complete the installation
3. Launch the application and import a CSV file in the correct format
4. Select "Classify stars" to start classification
5. Export the results to a spreadsheet

### System Requirements
- OS: Windows 10/11
- Processor: Intel Core i5/AMD Ryzen 5 (4 cores or more)
- Memory: 8GB (16GB recommended)
- Storage: 5GB of available space
- Graphics Card: Discrete graphics card optional (improves performance)

## Project Significance and Technical Insights

### Scientific Value
The project is a notable attempt in astronomical data processing to integrate traditional deterministic algorithms with generative AI. It retains the rigor of physical rules while drawing on the expressive power of LLMs, and it suggests new approaches to presenting and disseminating scientific data.

### Technical Insights
- Deterministic + Generative: Balances scientific accuracy and user-friendly output
- Local Processing: Protects sensitive data and supports offline work
- Domain Fine-tuning: Fine-tuning a general-purpose LLM on a specialist corpus can markedly improve its performance in that domain

## Limitations and Improvement Suggestions

### Current Limitations
- Primarily optimized for Gaia DR3 data; results may be inconsistent when using other data sources

### Improvement Directions
- Expand support for more data sources
- Further optimize the model's classification accuracy for edge spectral types
