Small Model, Big Impact: Practice of a Math Tutoring Agent Based on Code Reasoning

Can a math tutoring assistant be built on a small language model (SLM) with only 1.5 billion parameters? Through efficient fine-tuning with Unsloth, code generation with execution-based verification, and the LangChain agent architecture, this project demonstrates that SLMs can achieve reliable mathematical reasoning, providing a feasible path for low-cost deployment of educational AI.

Tags: Small Language Models · Mathematical Reasoning · Educational AI · Unsloth · QLoRA · LangChain · Code Generation · Agents · GSM8K
Published 2026-03-28 21:14 · Recent activity 2026-03-28 21:20 · Estimated read: 5 min

Section 01

Small Model, Big Impact: Core Practice of a Math Tutoring Agent

Can a reliable math tutoring assistant be built on a small language model (SLM) with only 1.5 billion parameters? This project combines efficient fine-tuning with Unsloth, code generation with verified execution, and the LangChain agent architecture to show that SLMs can deliver high-quality mathematical reasoning, providing a feasible path for low-cost deployment of educational AI and challenging the industry assumption that bigger models are always better.


Section 02

Background: Potential and Challenges of Small Models in Educational Scenarios

The industry is keen on pursuing models with tens or hundreds of billions of parameters, but the education sector (e.g., math tutoring for high school students) has a greater need for small models that can run on ordinary devices: their advantages include lower deployment costs, faster response times, better privacy protection, and the possibility of offline use. The conventional view holds that small models are incapable of complex mathematical reasoning; this project (slm-math-reasoning-agent) challenges that stereotype.


Section 03

Core Technical Approach: From Model to Agent Construction

The project adopts a "Plan-Code-Execute-Explain" pipeline: receive the problem and generate a solution plan → generate Python code → execute the code to obtain the result → integrate everything into a student-friendly explanation. The base model is Qwen2.5-1.5B-Instruct, fine-tuned with the Unsloth framework and QLoRA (4-bit quantized fine-tuning) to reduce training resource requirements; the fine-tuned model is then packaged as a LangChain agent that uses Pydantic models to manage structured state across the pipeline steps.
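As a rough illustration of how such a pipeline can be wired, the sketch below models the four stages around a single state object. The state fields, function names, and the hard-coded plan and code are illustrative assumptions: in the real agent the plan and code come from the fine-tuned SLM, and the state would be a Pydantic model inside LangChain rather than a plain dataclass (used here to keep the sketch dependency-free).

```python
from dataclasses import dataclass

# Hypothetical sketch of the Plan-Code-Execute-Explain state machine.
# A plain dataclass stands in for the project's Pydantic state.

@dataclass
class TutorState:
    problem: str
    plan: str = ""
    code: str = ""
    result: str = ""
    explanation: str = ""

def execute_step(state: TutorState) -> TutorState:
    # Run the generated Python in an isolated namespace and read back
    # a conventional `answer` variable. In production this call must
    # be sandboxed (timeouts, restricted builtins, no filesystem).
    namespace: dict = {}
    exec(state.code, namespace)
    state.result = str(namespace.get("answer"))
    return state

def run_pipeline(problem: str) -> TutorState:
    state = TutorState(problem=problem)
    # Plan and code would be produced by the fine-tuned SLM;
    # hard-coded here purely for illustration.
    state.plan = "Multiply the unit price by the quantity."
    state.code = "answer = 3 * 4"
    state = execute_step(state)
    state.explanation = f"Step by step: {state.plan} The answer is {state.result}."
    return state
```

Delegating arithmetic to executed code is what lets a 1.5B model stay numerically reliable: the model only has to plan and translate, not calculate.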


Section 04

Evidence Support: Innovation in Dataset and Evaluation System

The training data uses the generated_code-gsm8k-plan dataset (extended from GSM8K), in which each sample pairs a problem with a reasoning plan, Python code, and the final answer, teaching the model to decompose problems logically while delegating precise calculation to code. Evaluation uses an "LLM-as-a-Judge" approach (via the DeepSeek API), scoring four dimensions: answer correctness, reasoning quality, expression clarity, and student-friendliness, which goes beyond traditional exact-match metrics.
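The four-dimension rubric could be turned into a judge prompt along the following lines. The dimension names come from the text; the prompt wording, the `build_judge_prompt` helper, and the JSON response format are assumptions, not the project's actual prompt sent to the DeepSeek API.

```python
# Hypothetical LLM-as-a-Judge prompt builder for the rubric described
# in the text. The dimensions are from the article; everything else
# is an illustrative assumption.

JUDGE_DIMENSIONS = [
    "answer correctness",
    "reasoning quality",
    "expression clarity",
    "student-friendliness",
]

def build_judge_prompt(problem: str, model_answer: str, reference: str) -> str:
    rubric = "\n".join(f"- {d} (score 1-5)" for d in JUDGE_DIMENSIONS)
    return (
        "You are grading a math tutoring response for a student.\n"
        f"Problem: {problem}\n"
        f"Reference answer: {reference}\n"
        f"Model response: {model_answer}\n"
        "Score each dimension:\n"
        f"{rubric}\n"
        'Return JSON: {"scores": {...}, "justification": "..."}'
    )
```

The prompt would then be sent as a chat message to the judge model, with the returned JSON parsed into per-dimension scores; averaging over a test set gives a richer picture than exact-match accuracy alone.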


Section 05

Practical Value in Educational Scenarios

The value of this project in educational scenarios:

1. Homework assistance: helps students understand solutions instead of directly giving answers.
2. Learning companion: 24/7 personalized tutoring.
3. Teaching tool: cultivates logical thinking.

Compared to general-purpose large models, its advantages lie in controllability (no deviation from the topic, no inappropriate content) and consistency (stable behavior), which meets the needs of educational institutions and parents.


Section 06

Future Outlook and Implications for Educational AI

The technology stack spans training to deployment: Unsloth, Transformers/PEFT, TRL, LangChain/LangGraph, Pydantic, and the DeepSeek API. Future directions include integrating more math-domain data, multimodal capabilities (handwritten formula recognition), interactive interfaces, and learning progress tracking. The broader implication: educational AI should prioritize purpose-built small models, compensate for their deficiencies through tool augmentation (e.g., code execution), and advance the democratization of AI technology and educational equity.
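To make the "tool augmentation" point concrete, below is a minimal safe arithmetic evaluator of the kind such an agent could expose as a tool: rather than trusting the SLM's mental math, the agent hands expressions to a whitelist-based evaluator. This is an illustrative stand-in built on Python's `ast` module, not the project's actual code-execution tool.

```python
import ast
import operator

# Hypothetical arithmetic tool: evaluates an expression by walking its
# AST and permitting only whitelisted operators, so arbitrary code
# (imports, calls, attribute access) is rejected outright.

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expression: str):
    def walk(node: ast.AST):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed expression node: {type(node).__name__}")
    return walk(ast.parse(expression, mode="eval"))
```

For example, `safe_eval("3 * (8 + 4)")` returns 36, while `safe_eval("__import__('os')")` raises `ValueError`. Wrapping such a function as a LangChain tool gives the small model exact arithmetic without expanding its attack surface.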