Zing Forum

GradeOps: An AI Automatic Grading System Based on Multimodal Large Models and LangGraph

GradeOps is an open-source AI grading engine that combines vision-language models with LangGraph agent workflows to enable automatic recognition, scoring, and manual review processes for handwritten exam papers.

Tags: AI grading, LangGraph, vision-language models, edtech, FastAPI, human-machine collaboration, OCR, Gemini, Nemotron
Published 2026-06-14 22:14 | Recent activity 2026-06-14 22:21 | Estimated read 8 min

Section 01

GradeOps: Guide to the AI Automatic Grading System Based on Multimodal Large Models and LangGraph

GradeOps is an open-source AI grading engine that combines vision-language models with LangGraph agent workflows to recognize, score, and route handwritten exam papers for manual review. The backend is built with FastAPI and follows an "AI initial grading + manual review" human-machine collaboration model intended to improve both efficiency and quality. The result is a modular, practically engineered reference implementation for the edtech field.

Section 02

Project Background and Problem Scenarios

Traditional exam grading faces challenges in both efficiency and accuracy: handwriting is hard to recognize, subjective questions are scored inconsistently, and large-scale exams are slow to grade. The maturing of large language models and vision-language models makes it possible to address these problems.

GradeOps Backend is designed for this scenario. Its core is the "AI initial grading + manual review" human-machine collaboration model, which improves efficiency while preserving quality. It is an edtech engineering example built with FastAPI and LangGraph.

Section 03

Core Architecture Design

Two-Stage AI Pipeline

The first stage uses an NVIDIA Nemotron vision-language model, accessed through OpenRouter, to perform OCR on handwritten text and mathematical formulas. The second stage uses Google Gemini to score the transcribed answers against nested JSON grading criteria and to explain its reasoning. The two stages are decoupled so that either component can be upgraded or replaced independently.
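
The decoupling between the two stages can be sketched in plain Python. This is a minimal illustration rather than the project's code: the `Transcriber` and `Scorer` interfaces, the stub classes, and the field names in the nested grading criteria are all assumptions made for this example, and the stubs stand in for the Nemotron and Gemini calls.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class GradeResult:
    points: float
    max_points: float
    reasoning: str


class Transcriber(Protocol):
    """Stage 1: turns a page image into text (hypothetical interface)."""
    def transcribe(self, image: bytes) -> str: ...


class Scorer(Protocol):
    """Stage 2: scores transcribed text against a rubric (hypothetical interface)."""
    def score(self, text: str, rubric: dict) -> GradeResult: ...


class StubTranscriber:
    """Stand-in for the vision-language OCR call; returns fixed text."""
    def transcribe(self, image: bytes) -> str:
        return "x = 4 because 2x = 8"


class KeywordScorer:
    """Stand-in for the LLM grader: awards points when a criterion's keyword appears."""
    def score(self, text: str, rubric: dict) -> GradeResult:
        earned, notes = 0.0, []
        for criterion in rubric["criteria"]:
            hit = criterion["keyword"] in text
            earned += criterion["points"] if hit else 0.0
            notes.append(f"{criterion['name']}: {'met' if hit else 'not met'}")
        total = sum(c["points"] for c in rubric["criteria"])
        return GradeResult(earned, total, "; ".join(notes))


def grade_page(image: bytes, rubric: dict, ocr: Transcriber, scorer: Scorer) -> GradeResult:
    """Runs both stages; either stage can be swapped without touching the other."""
    return scorer.score(ocr.transcribe(image), rubric)


# Nested grading criteria (field names are assumed, not taken from the project).
RUBRIC = {
    "question": "Solve 2x = 8",
    "criteria": [
        {"name": "setup", "keyword": "2x = 8", "points": 1.0},
        {"name": "answer", "keyword": "x = 4", "points": 2.0},
        {"name": "check", "keyword": "verify", "points": 1.0},
    ],
}

result = grade_page(b"", RUBRIC, StubTranscriber(), KeywordScorer())
```

Because `grade_page` depends only on the two interfaces, replacing the OCR model or the scoring model would mean supplying a different object, with no change to the pipeline function itself.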

LangGraph Agent Orchestration

LangGraph models the grading lifecycle as a state machine. When manual intervention is needed, the graph pauses automatically and persists its state to SQLite via SqliteSaver, so the review queue survives across requests.
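
The pause-and-resume idea can be illustrated without LangGraph. The sketch below uses only the standard `sqlite3` module as a stand-in for the checkpointer, so it is not LangGraph's API; the step names, the table layout, and the function names are hypothetical. It shows why persisting state lets a review started in one request be completed in a later one.

```python
import json
import sqlite3

# Hypothetical lifecycle steps; the real graph's node names are not given in the article.
TRANSITIONS = {"uploaded": "transcribed", "transcribed": "scored", "scored": "awaiting_review"}


def open_store(path=":memory:"):
    """Creates the checkpoint table if needed and returns the connection."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checkpoints (paper_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    return conn


def save(conn, paper_id, state):
    """Persists the latest state for a paper, replacing any earlier checkpoint."""
    conn.execute(
        "INSERT OR REPLACE INTO checkpoints (paper_id, state) VALUES (?, ?)",
        (paper_id, json.dumps(state)),
    )
    conn.commit()


def load(conn, paper_id):
    """Returns the stored state for a paper, or None if there is no checkpoint."""
    row = conn.execute(
        "SELECT state FROM checkpoints WHERE paper_id = ?", (paper_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def run_until_review(conn, paper_id, ai_score):
    """Advances the paper step by step, checkpointing each step, and stops at review."""
    state = {"step": "uploaded", "ai_score": ai_score, "final_score": None}
    save(conn, paper_id, state)
    while state["step"] in TRANSITIONS:
        state["step"] = TRANSITIONS[state["step"]]
        save(conn, paper_id, state)
    return state


def resume_with_review(conn, paper_id, reviewed_score):
    """Called from a later request: reloads the paused state and finalizes it."""
    state = load(conn, paper_id)
    if state is None or state["step"] != "awaiting_review":
        raise ValueError(f"paper {paper_id!r} is not waiting for review")
    state.update(step="finalized", final_score=reviewed_score)
    save(conn, paper_id, state)
    return state


conn = open_store()
paused = run_until_review(conn, "paper-001", ai_score=7.5)
final = resume_with_review(conn, "paper-001", reviewed_score=8.0)
```

With a file path instead of `:memory:`, the paused state would also survive a process restart, which is the property the checkpoint database provides.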

Human-Machine Collaboration Mechanism

A REST API bridges the AI pipeline and the frontend. Teaching assistants can view the AI scores, adjust them, and confirm them, and the results are written back to the LangGraph state. Teachers hold the final decision-making power.

Section 04

Detailed Tech Stack

Backend Framework

  • FastAPI: High-performance asynchronous Python web framework that supports automatic OpenAPI documentation generation

AI and Orchestration

  • LangGraph: Agent workflow framework for building grading state machines
  • LangChain: Integrates various model providers

Model Configuration

  • Google Gemini (gemini-3.1-flash-lite): Scoring and reasoning explanations
  • NVIDIA Nemotron (nvidia/nemotron-nano-12b-v2-vl:free): Handwritten OCR, accessed through OpenRouter

Data Layer

  • PostgreSQL: Stores user information, exam paper metadata, etc.
  • SQLite: LangGraph state checkpoints (checkpoints.db)
  • SQLAlchemy: ORM layer for unified data access

Security and Tools

  • PyJWT: Session token signing
  • pwdlib[argon2]: Password hashing
  • PyMuPDF: PDF exam paper parsing

Section 05

Key Design Highlights

Asynchronous Background Processing

OCR recognition and scoring run as FastAPI background tasks. The upload endpoint responds quickly while model inference executes in the background, which keeps the API responsive and lets the dashboard update in real time.
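
The "respond first, process afterwards" pattern can be sketched with the standard library alone. The example below uses `ThreadPoolExecutor` as a stand-in for FastAPI's background tasks, so it is not the project's code; `handle_upload`, `slow_grade`, and the `jobs` status table are hypothetical names.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
jobs = {}  # job_id -> status record that a dashboard could poll


def slow_grade(job_id, pages):
    """Placeholder for OCR plus scoring; here it only records a result."""
    jobs[job_id]["status"] = "processing"
    jobs[job_id]["result"] = {"pages_graded": pages}
    jobs[job_id]["status"] = "done"


def handle_upload(pages):
    """Registers the job, schedules the work, and returns without waiting for it."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = {"status": "queued", "result": None}
    future = executor.submit(slow_grade, job_id, pages)
    return {"job_id": job_id, "status": "accepted"}, future


response, future = handle_upload(pages=3)
future.result()  # only for this demo: wait so the final status can be inspected
final_status = jobs[response["job_id"]]["status"]
```

The caller receives the job identifier immediately and can poll the status record later, which is what keeps the upload path fast regardless of how long inference takes.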

Role-Based Access Control

Teacher and teaching assistant permissions are kept separate. Authentication state is carried in HttpOnly cookies, which page scripts cannot read, reducing the risk of token theft through XSS attacks.
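
Both ideas can be shown in a few lines of standard-library Python. This is a sketch under assumptions: the permission table and action names are hypothetical (the article names the two roles but not their exact rights), and the cookie is built with `http.cookies` rather than through FastAPI's response helpers.

```python
from http.cookies import SimpleCookie

# Hypothetical permission table for the two roles named in the article.
PERMISSIONS = {
    "teacher": {"view_scores", "adjust_scores", "finalize_grades"},
    "teaching_assistant": {"view_scores", "adjust_scores"},
}


def is_allowed(role, action):
    """Returns True when the role's permission set contains the action."""
    return action in PERMISSIONS.get(role, set())


def session_cookie_header(token):
    """Builds a Set-Cookie value that page scripts cannot read."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["httponly"] = True   # hidden from document.cookie
    cookie["session"]["secure"] = True     # sent over HTTPS only
    cookie["session"]["samesite"] = "Lax"  # limits cross-site sends
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()


header = session_cookie_header("signed-token-placeholder")
```

Unknown roles fall through to an empty permission set, so access is denied by default rather than granted by omission.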

Relational Data Optimization

Data queries are optimized for large-scale exam scenarios: batch scoring statistics are aggregated efficiently so that performance does not degrade as data grows.
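
One common way to keep such statistics cheap is to aggregate inside the database instead of loading every row into Python. The sketch below illustrates that general technique only; the article does not describe the project's actual queries. It uses an in-memory SQLite database for self-containment (the project stores this data in PostgreSQL through SQLAlchemy), and the `grades` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE grades (exam_id TEXT, student_id TEXT, question_no INTEGER, score REAL)"
)
# An index on the filter column lets the database skip rows from unrelated exams.
conn.execute("CREATE INDEX idx_grades_exam ON grades (exam_id)")
conn.executemany(
    "INSERT INTO grades VALUES (?, ?, ?, ?)",
    [
        ("midterm", "s1", 1, 4.0),
        ("midterm", "s1", 2, 6.0),
        ("midterm", "s2", 1, 2.0),
        ("midterm", "s2", 2, 8.0),
        ("final", "s1", 1, 5.0),
    ],
)


def question_stats(conn, exam_id):
    """Returns (question_no, submissions, average score) rows computed in SQL."""
    return conn.execute(
        """
        SELECT question_no, COUNT(*), AVG(score)
        FROM grades
        WHERE exam_id = ?
        GROUP BY question_no
        ORDER BY question_no
        """,
        (exam_id,),
    ).fetchall()


stats = question_stats(conn, "midterm")
```

Only one summary row per question crosses the database boundary, so the amount of data returned stays small even when the number of graded answers grows.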

Section 06

Deployment and Usage Guide

The project provides a complete local development guide:

  1. Environment configuration: Manage sensitive information such as database connections and API keys via .env
  2. Virtual environment: Setup steps for Windows, macOS, and Linux
  3. Dependency management: A detailed requirements.txt plus instructions for supplementary packages
  4. Database initialization: PostgreSQL creation script
  5. API documentation: Automatically generated Swagger UI supports interactive testing
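
As an illustration of the first step, configuration can be read from environment variables and validated at startup. This is a minimal sketch, not the project's settings module: the variable names `DATABASE_URL`, `GEMINI_API_KEY`, and `OPENROUTER_API_KEY` and the `Settings` class are assumptions, and loading the .env file into the environment is left to whatever tool the project uses.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    gemini_api_key: str
    openrouter_api_key: str


def load_settings(env=None):
    """Reads required settings and fails early, naming every missing variable."""
    env = os.environ if env is None else env
    names = ("DATABASE_URL", "GEMINI_API_KEY", "OPENROUTER_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(env["DATABASE_URL"], env["GEMINI_API_KEY"], env["OPENROUTER_API_KEY"])


# Placeholder values for demonstration only; real values belong in .env, not in code.
settings = load_settings({
    "DATABASE_URL": "postgresql://user:password@localhost:5432/gradeops",
    "GEMINI_API_KEY": "placeholder",
    "OPENROUTER_API_KEY": "placeholder",
})
```

Failing at startup with the full list of missing variables tends to be easier to debug than a connection error surfacing on the first request.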

The accompanying React frontend (gradeops-frontend repository) forms a complete educational grading solution.

Section 07

Summary and Reflections

GradeOps shows how current AI technology can be applied to a traditional education scenario. Three aspects of its design are worth studying:

  1. Human-machine collaboration rather than replacement: AI handles initial grading and standardized processing, while humans retain final decision-making power
  2. Modular architecture: The two-stage model and pluggable LangGraph state machine reflect good decoupling
  3. Engineering practicality: Technical choices such as httponly cookies and asynchronous background tasks serve actual deployment needs

It gives developers exploring AI applications in education a fully functional, clearly structured reference implementation.