# Albert Bot: A Browser Automation Tool for Automatically Completing Albert.io Exercises Using Local Large Language Models

> An educational automation project based on Playwright and local LLMs, demonstrating how to combine browser automation with large language models to achieve intelligent parsing and interaction of web content.

- 板块: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- 发布时间: 2026-05-16T02:43:56.000Z
- 最近活动: 2026-05-16T02:52:46.843Z
- 热度: 159.8
- 关键词: 浏览器自动化, Playwright, 本地大模型, LLM, Python, 教育技术, LM Studio, RPA
- 页面链接: https://www.zingnex.cn/en/forum/thread/albert-bot-albert-io
- Canonical: https://www.zingnex.cn/forum/thread/albert-bot-albert-io
- Markdown 来源: floors_fallback

---

## Introduction / Main Floor: Albert Bot: A Browser Automation Tool for Automatically Completing Albert.io Exercises Using Local Large Language Models

An educational automation project based on Playwright and local LLMs, demonstrating how to combine browser automation with large language models to achieve intelligent parsing and interaction of web content.

## Project Background and Motivation

Albert.io is a popular online learning platform that offers adaptive skill practice and assessments. For learners, repeatedly completing practice exercises is an effective way to consolidate knowledge, but manual operation is often time-consuming and repetitive. The Albert Bot project was born out of the developer's interest in exploring browser automation technology and the reasoning capabilities of local large language models (LLMs).

The core goal of this project is not to simply "cheat" or bypass academic integrity policies, but to serve as a **learning-oriented experimental project** that demonstrates how to combine modern AI tools with browser automation frameworks to achieve intelligent web interaction. The developer explicitly emphasized this point and included a detailed usage disclaimer in the project documentation.

## Technical Architecture Overview

Albert Bot adopts a modular Python architecture, which mainly includes the following core components:

## 1. Browser Automation Layer (Playwright)

The project uses Playwright as the browser automation engine. Playwright is a modern browser automation tool open-sourced by Microsoft, supporting Chromium, Firefox, and WebKit. Compared to traditional Selenium, Playwright provides more stable APIs, better asynchronous support, and more powerful network interception capabilities.

In Albert Bot, Playwright is responsible for:
- Automatically logging into the Albert.io platform
- Navigating to the specified skill practice page
- Capturing screenshots of the question area or extracting HTML content
- Simulating user interactions such as clicks and inputs
- Monitoring page state changes and identifying answer results

## 2. Multi-Backend LLM Support

The project has designed a flexible Solver architecture that supports multiple LLM backends:

- **Local LLM (LM Studio)**: Run via a local HTTP server, completely offline, protecting privacy
- **Claude API**: Anthropic's Claude series models
- **OpenAI API**: GPT series models
- **Gemini API**: Google's Gemini models

This multi-backend design allows users to choose the most suitable model based on their needs and resources. Local operation is particularly suitable for scenarios sensitive to data privacy, while cloud APIs provide stronger reasoning capabilities.

## 3. Configuration and Logging System

The project adopts a design that separates environment variables and configuration files:
- The `.env` file stores sensitive information (API keys, login credentials)
- The `config.py` file stores application configurations (URL lists, solver type, model selection)

The logging system creates a timestamped directory for each run, containing:
- `session.jsonl`: Detailed question-by-question operation records
- `stats.json`: Session statistics (number of correct/incorrect answers, level changes, etc.)

## Core Workflow

The workflow of Albert Bot embodies a typical pattern of combining browser automation with AI:

1. **Initialization Phase**: Load configurations, launch the browser, log into the platform
2. **Question Detection**: Monitor the page to identify the appearance of new practice questions
3. **Content Extraction**: Extract key information such as question text, options, and images
4. **AI Reasoning**: Send the formatted question to the selected LLM backend
5. **Answer Execution**: Parse the answer returned by the LLM and simulate user selection or input
6. **Result Monitoring**: Wait for page feedback, record the answer result, and proceed to the next question

This workflow shows how to use the LLM as the "brain" and Playwright as the "hands and feet" to achieve end-to-end automated task processing.

## HTML Extraction and Content Understanding

The project needs to handle the complex dynamically generated HTML structure of Albert.io and extract meaningful question content. This involves:
- Using CSS selectors or XPath to locate elements
- Processing rich text content and mathematical formulas
- Identifying question types (multiple choice, fill-in-the-blank, etc.)
