Zing Forum


Interact-LLM: An Experimental Framework for Large Language Models as Cognitive Tutors in Language Learning

An open-source codebase from the INTERACT-LLM project at Aarhus University in Denmark, exploring the potential of large language models (LLMs) as cognitive tutors in language learning scenarios. It includes components such as an inference engine, a terminal chatbot, and alignment drift experiments.

Tags: Large language models · Language learning · Cognitive tutors · Educational AI · Alignment drift · LLM inference · Aarhus University · Interactive learning
Published 2026-04-21 16:46 · Recent activity 2026-04-21 16:59 · Estimated read 8 min

Section 01

[Introduction] Interact-LLM: Exploring Large Language Models as Cognitive Tutors in Language Learning via an Experimental Framework

The open-source codebase released by the INTERACT-LLM project at Aarhus University in Denmark explores the potential of large language models (LLMs) as cognitive tutors in language learning. It includes core components such as an inference engine, a terminal chatbot, and alignment drift experiments, giving language learning researchers, AI education developers, AI safety researchers, and related groups reusable experimental tools for applying LLMs as cognitive tutors.


Section 02

Project Background and Research Motivation

INTERACT-LLM is a project initiated by an interdisciplinary team at Aarhus University in Denmark. Its core hypothesis is that LLMs can serve not only as information providers but also, through deliberately designed interaction patterns, as cognitive tutors that help learners build knowledge, correct errors, and receive feedback. Traditional language learning software focuses on vocabulary and grammar drills and lacks the crucial interaction and feedback loop, whereas the cognitive tutor concept from educational psychology emphasizes Socratic questioning, immediate feedback, and scaffolding. Combined with the open-domain dialogue capabilities of LLMs, this approach is expected to yield more adaptive and personalized learning experiences.


Section 03

Core Components of the Codebase

The Interact-LLM codebase consists of two core parts:

  1. Inference Engine and Terminal Chatbot (interact_llm module): implements the LLM inference engine and a terminal interaction interface, currently supporting a Spanish-tutor role. Prompt engineering and context management strategies address the needs of educational scenarios (tracking the learner's knowledge state, identifying misunderstandings, providing targeted feedback), and the terminal interface makes observation and debugging easy for researchers.
  2. Collection of Experimental Scripts (scripts directory): contains the experimental setups tied to specific papers, such as the alignment drift experiment associated with Almasi & Kristensen-McLachlan (2025), which studies behavioral alignment drift in long-running LLM interactions. Experimental code and analysis code are kept separate (the analysis code lives in the INTERACT-LLM/alignment-drift-llms repository).
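The tutor loop described above (a system prompt defining the tutor role, plus a growing message history as the context-management strategy) can be sketched as follows. This is an illustrative sketch, not the project's actual API: SYSTEM_PROMPT, generate_reply, and chat_turn are hypothetical names, and the stub reply function stands in for a real LLM call so the loop runs without a model.

```python
# Minimal sketch of a terminal tutor chatbot loop (hypothetical names,
# not the interact_llm module's real API). The real inference engine
# would call an LLM such as Llama-3.1-8B-Instruct instead of the stub.

SYSTEM_PROMPT = (
    "You are a patient Spanish tutor. Track what the learner knows, "
    "identify misunderstandings, and give targeted feedback."
)

def generate_reply(messages: list[dict]) -> str:
    """Stand-in for the LLM inference call so the loop is runnable here."""
    last_user = messages[-1]["content"]
    return f"(tutor feedback on: {last_user!r})"

def chat_turn(history: list[dict], user_input: str) -> str:
    """Append the learner's turn, query the model with the full context
    (system prompt + all prior turns), and record the tutor's reply."""
    history.append({"role": "user", "content": user_input})
    reply = generate_reply(history)
    history.append({"role": "assistant", "content": reply})
    return reply

if __name__ == "__main__":
    history = [{"role": "system", "content": SYSTEM_PROMPT}]
    print(chat_turn(history, "Yo es estudiante."))
```

Keeping the entire history in the prompt is the simplest context-management choice; it is also what makes long conversations a natural testbed for alignment drift, since the tutor persona must survive an ever-growing context.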

Section 04

Technical Implementation Details

The project's technical features include:

  • Dependency Management: uses the uv tool plus a Makefile for automated environment configuration; a single make setup installs dependencies and creates a virtual environment.
  • Model Support: compatible with open-source LLMs such as Llama-3.1-8B-Instruct. Gated models are accessed via a Hugging Face token (stored in tokens/hf_token.txt, which is kept out of Git).
  • Cross-Platform Compatibility: developed and tested on Python 3.12.3, supporting macOS 15.3.1 and Ubuntu 24.04 to ensure research reproducibility.
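The token handling noted above might look roughly like this. The helper name read_hf_token is an assumption (only the tokens/hf_token.txt location comes from the source), and the commented-out transformers call merely illustrates one way such a token could be passed to a gated model loader.

```python
from pathlib import Path

def read_hf_token(path: str = "tokens/hf_token.txt") -> str:
    """Read a Hugging Face access token from a file kept out of Git.
    (Hypothetical helper; the file location matches the project docs.)"""
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"empty token file: {path}")
    return token

# The token could then be handed to a gated model loader, e.g. (not run here):
#
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained(
#       "meta-llama/Llama-3.1-8B-Instruct", token=read_hf_token()
#   )
```

Reading the token from an untracked file, rather than hard-coding it or using an environment variable baked into scripts, keeps credentials out of version control while remaining easy to reproduce across machines.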

Section 05

Research Methodology and Experimental Design

The project follows rigorous academic methods:

  • Version Tags and Paper Association: Semantic version tags (e.g., vX.X.X-alignment-drift) are bound to specific papers to ensure result traceability.
  • Separation of Code and Analysis: Experimental code (inference/interaction logic) and analysis code (statistics/visualization) are in separate repositories, achieving separation of concerns, reducing security risks, and improving reusability.
  • Reproducibility Commitment: Each experiment directory contains a detailed README to guide result reproduction.

Section 06

Application Scenarios and Potential Value

Interact-LLM has reference value for the following groups:

  • Language learning researchers: can directly use or modify the framework to test teaching hypotheses;
  • AI education application developers: can draw on its prompt engineering and context management methods;
  • AI safety researchers: can use the alignment drift experiment tools to study the stability of LLM behavior;
  • Computational linguistics scholars: can obtain empirical data on LLMs in language learning settings.

Section 07

Limitations and Future Outlook

Current status and caveats:

  • Early Development: the code is for internal use and not production-ready; APIs and structures may change frequently;
  • Functional Limitations: functionality is still relatively limited, and generality and configurability need improvement;
  • Model Dependency: experimental results depend on the LLM used, so results should be interpreted with care;
  • Ethical Considerations: ethical review procedures (covering data privacy, algorithmic bias, etc.) must be followed.

Future plans include migrating model support to Gemma4 27B and continuing to optimize performance.