Zing Forum

aexp: An Experiment Control Plane for AI Researchers

aexp is a lightweight experiment control plane written in Go that helps researchers and programming agents manage experiments on remote GPU servers. It keeps the simplicity of SSH and adds run tracking, tmux-based execution, resource monitoring, structured metrics, and MCP tools.

Tags: experiment management, GPU, SSH, tmux, MCP, Go, SQLite, remote execution, agent-friendly, open source
Published 2026-06-12 14:46 | Recent activity 2026-06-12 14:53 | Estimated read 7 min

Section 01

Introduction: aexp, a Lightweight Experiment Control Plane for AI Researchers

aexp is a lightweight experiment control plane written in Go. It is designed for researchers and programming agents who run experiments on remote GPU servers and struggle to keep track of them. It keeps the simplicity of SSH and adds run tracking, tmux-based execution, resource monitoring, structured metrics, and MCP tools, giving remote experiments a structured management layer that is especially useful in AI-assisted research.

Section 02

Background: Pain Points in Remote GPU Experiment Management

Traditional remote GPU experiment management relies on SSH sessions, manual use of nohup or tmux, and memorized log paths. This is already cumbersome for human users and far worse for programming agents: once a session changes, an agent can no longer tell what was submitted, where the logs are, or whether a run is still active. aexp was created to solve this problem.

Section 03

Technical Architecture: Simple yet Powerful Design Choices

  • Tech Stack: Written in Go, which yields a single statically compiled binary with cross-platform support and good performance;
  • Data Storage: Uses SQLite, which needs no configuration and keeps everything in a single file that is easy to back up and migrate;
  • Remote Execution: Uses tmux as the backend, which provides session persistence, output capture, and real-time logs; tmux is also already installed on many servers.
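
The tmux-backed execution model can be illustrated with a minimal Go sketch. It is not taken from aexp's source: the helper names `tmuxStartArgs` and `tmuxCaptureArgs`, the session name, and the training command are all hypothetical. It only shows the standard tmux invocations (`new-session -d -s` and `capture-pane -p -t`) that a tool of this kind might issue over SSH to start a detached session and read its output back.

```go
package main

import "fmt"

// tmuxStartArgs returns the argv for starting a detached (-d) tmux session
// named session (-s) that runs command in its first window.
// Hypothetical helper for illustration; not part of aexp.
func tmuxStartArgs(session, command string) []string {
	return []string{"tmux", "new-session", "-d", "-s", session, command}
}

// tmuxCaptureArgs returns the argv for printing the contents of the
// session's active pane to stdout (-p) rather than to a paste buffer.
func tmuxCaptureArgs(session string) []string {
	return []string{"tmux", "capture-pane", "-p", "-t", session}
}

func main() {
	// "run_demo" and the training command are made-up example values.
	fmt.Printf("%q\n", tmuxStartArgs("run_demo", "python train.py"))
	fmt.Printf("%q\n", tmuxCaptureArgs("run_demo"))
}
```

Because the session is detached, the command keeps running after the SSH connection closes, which is the session persistence the list above refers to.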

Section 04

Core Features: SSH Convenience + Structured Management Capabilities

Comparison with the Plain SSH Workflow

Requirement                     | Plain SSH                 | aexp
--------------------------------+---------------------------+---------------------------------------
Quick command check             | ssh host ...              | aexp exec --resource gpu -- ...
Launch long-running experiments | Manual tmux/nohup         | aexp run submit ...
Find logs                       | Memorize paths            | aexp run logs <run_id>
View resources                  | Remote commands           | Web dashboard + resource snapshots
Restore context                 | Reconstruct shell history | Query runs/events/metrics
Distinguish run types           | Naming conventions        | Native --kind flag (e.g. --kind setup)

Core Features

  • Resource Management: Register SSH resources, configure environment/GPU tags, etc.;
  • Run Submission: Support multiple run types, automatic environment detection;
  • Project Sync: Integrate rsync for local-remote file synchronization;
  • Event System: Structured recording of progress/parameter/metric events;
  • Web Dashboard: Resource overview, real-time logs, monitoring, etc.
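
To make the idea of structured progress/parameter/metric events concrete, here is a minimal Go sketch of one possible event record serialized as a JSON line. The `Event` type, its field names, and the JSON keys are assumptions made for illustration; the article does not describe aexp's actual event schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical structured run event. The fields and JSON keys
// are illustrative only and are not aexp's real schema.
type Event struct {
	RunID string  `json:"run_id"`
	Kind  string  `json:"kind"` // "progress", "parameter", or "metric"
	Name  string  `json:"name"`
	Value float64 `json:"value"` // numeric only, to keep the sketch small
	Step  int     `json:"step"`
}

// validKind reports whether kind is one of the three event kinds
// named in the feature list.
func validKind(kind string) bool {
	switch kind {
	case "progress", "parameter", "metric":
		return true
	}
	return false
}

// encodeEvent serializes an event as a single JSON line, rejecting
// unknown kinds so that stored events stay queryable by kind.
func encodeEvent(e Event) (string, error) {
	if !validKind(e.Kind) {
		return "", fmt.Errorf("unknown event kind %q", e.Kind)
	}
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	line, err := encodeEvent(Event{RunID: "run_demo", Kind: "metric", Name: "loss", Value: 0.42, Step: 100})
	if err != nil {
		panic(err)
	}
	fmt.Println(line)
}
```

Records in a fixed shape like this can be filtered by run and kind later, which is what makes them more useful to an agent than free-form log text.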

Section 05

MCP Integration: Interfaces Tailored for Programming Agents

aexp natively supports the Model Context Protocol (MCP), an open protocol introduced by Anthropic, so that agents can call aexp as a set of tools:

  • Installation Command: aexp mcp install --target all (automatically configures MCP tools);
  • Available Interfaces: aexp_exec (quick check), aexp_submit_run (submit experiment), aexp_sync_push (sync files), etc.;
  • Value: Solves the problem of agent context loss, enabling programmable and automated experiment management.
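
For readers unfamiliar with MCP, a tool invocation travels as a JSON-RPC 2.0 request with the method `tools/call`. The Go sketch below builds such a request for the `aexp_exec` tool named above. The argument keys `resource` and `command` are assumptions for illustration; the article does not document the tool's real input schema, and in practice an MCP client library would construct this message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP "tools/call" request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// newToolCall builds the JSON for calling the given tool with args.
func newToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	})
}

func main() {
	// The argument keys below are hypothetical, not aexp's documented schema.
	req, err := newToolCall(1, "aexp_exec", map[string]any{
		"resource": "gpu-box",
		"command":  "df -h /workspace",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))
}
```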

Section 06

Use Cases and Quick Start Guide

Use Cases

  • Machine learning researchers: Complete experiment tracking, automated log/metric collection;
  • Agent workflows: Reliable experiment submission/monitoring, session-to-session state persistence;
  • Team collaboration: Share SQLite databases, view each other's experiments.

Installation Methods

  • Binary: curl -fsSL https://raw.githubusercontent.com/murasame612/aexp/main/scripts/install.sh | sh;
  • Source Compilation: git clone ... && go build.

Common Workflows

  • Quick check: aexp exec --resource gpu-box -- df -h /workspace;
  • Submit experiment: aexp run submit --resource gpu-box --kind formal ...;
  • Monitor run: aexp run logs run_xxx --tail 100.
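
Scripts that wrap these commands can build the argument lists programmatically. The Go sketch below is one hedged way to do that, using only the command forms shown in the list above (`aexp exec --resource ... -- ...` and `aexp run logs ... --tail ...`). The helper names `execArgs` and `logsArgs` are hypothetical, and no other aexp flags are assumed.

```go
package main

import (
	"fmt"
	"strconv"
)

// execArgs mirrors the quick-check form shown above:
//   aexp exec --resource <resource> -- <command...>
// The remote command is passed as separate arguments after "--".
func execArgs(resource string, command ...string) []string {
	args := []string{"aexp", "exec", "--resource", resource, "--"}
	return append(args, command...)
}

// logsArgs mirrors the monitoring form shown above:
//   aexp run logs <run_id> --tail <n>
func logsArgs(runID string, tail int) []string {
	return []string{"aexp", "run", "logs", runID, "--tail", strconv.Itoa(tail)}
}

func main() {
	fmt.Printf("%q\n", execArgs("gpu-box", "df", "-h", "/workspace"))
	fmt.Printf("%q\n", logsArgs("run_xxx", 100))
}
```

Either slice can be handed to `exec.Command(args[0], args[1:]...)` from the standard `os/exec` package, which avoids building a shell string and the quoting problems that come with it.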

Section 07

Limitations and Future Development Directions

Current Limitations

  • Mainly targets SSH resources; does not support Docker/Slurm/Kubernetes, etc.;
  • Open-source version remains simple: local binary + SQLite + SSH + tmux.

Future Directions

  • Support containerized execution (Docker);
  • Integrate Slurm cluster scheduling;
  • Local execution mode;
  • Richer visualization and experiment comparison tools.

Section 08

Conclusion: A Bridge Connecting Researchers and Computing Resources

aexp combines the flexibility of SSH with the structured capabilities of modern experiment management and exposes them to agents through MCP. The result is a lightweight tool for managing remote GPU experiments that covers run tracking, execution, log retrieval, and monitoring. Its open-source, agent-friendly design makes it a promising direction for AI-assisted research tooling, and it may come to play a larger role in connecting human researchers with computing resources.