# aexp: An Experiment Control Plane for AI Researchers

> aexp is a lightweight experiment control plane written in Go that helps researchers and programming agents manage experiments on remote GPU servers. It keeps the simplicity of SSH while adding run tracking, tmux-based execution, resource monitoring, structured metrics, and MCP tools.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-06-12T06:46:11.000Z
- Last activity: 2026-06-12T06:53:13.286Z
- Popularity: 163.9
- Keywords: experiment management, GPU, SSH, tmux, MCP, Go, SQLite, remote execution, agent-friendly, open source
- Page link: https://www.zingnex.cn/en/forum/thread/aexp-ai
- Canonical: https://www.zingnex.cn/forum/thread/aexp-ai
- Markdown source: floors_fallback

---

## Introduction: aexp, a Lightweight Experiment Control Plane for AI Researchers

aexp is a lightweight experiment control plane written in Go for researchers and programming agents who run experiments on remote GPU servers. It keeps the simplicity of SSH and adds run tracking, tmux execution, resource monitoring, structured metrics, and MCP tools. Together these give remote experiments a structured management layer, which is especially useful in AI-assisted research.

## Background: Pain Points in Remote GPU Experiment Management

Remote GPU experiments are traditionally managed through SSH sessions, manually started `nohup` or tmux jobs, and memorized log paths. This is cumbersome for humans and far worse for programming agents: once a session changes, an agent can no longer tell what it submitted, where the logs are, or whether a run is still alive. aexp was created to solve this problem.

## Technical Architecture: Simple yet Powerful Design Choices

- **Tech Stack**: Written in Go, which yields a single statically compiled binary, cross-platform support, and good performance;
- **Data Storage**: Uses SQLite, which needs no configuration and keeps everything in a single file that is easy to back up and migrate;
- **Remote Execution**: Uses tmux as the backend, which provides session persistence, output capture, and real-time logs; tmux is also commonly available on servers.
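
To make the tmux backend concrete, here is a minimal Go sketch of the three tmux invocations such a backend would plausibly rely on: starting a detached session, checking that it is still alive, and capturing its output. This is not aexp's actual implementation; the helper names, the session name `run_demo`, and the command `python train.py` are assumptions made for illustration. Only the tmux subcommands and flags themselves (`new-session -d -s`, `has-session -t`, `capture-pane -p -t`) are standard.

```go
package main

import (
	"fmt"
	"strings"
)

// startArgs builds the arguments for `tmux new-session`. The -d flag starts
// the session detached, so the command keeps running after SSH disconnects.
func startArgs(session, command string) []string {
	return []string{"new-session", "-d", "-s", session, command}
}

// aliveArgs builds the arguments for `tmux has-session`, which exits with
// status 0 only while the named session still exists.
func aliveArgs(session string) []string {
	return []string{"has-session", "-t", session}
}

// captureArgs builds the arguments for `tmux capture-pane`. The -p flag
// prints the pane contents to stdout instead of storing them in a buffer.
func captureArgs(session string) []string {
	return []string{"capture-pane", "-p", "-t", session}
}

func main() {
	session := "run_demo" // hypothetical session name
	commands := [][]string{
		startArgs(session, "python train.py"), // hypothetical training command
		aliveArgs(session),
		captureArgs(session),
	}
	for _, args := range commands {
		// Dry run: print each command rather than executing it.
		fmt.Println("tmux " + strings.Join(args, " "))
	}
}
```

On a remote host, each argument list would be passed to tmux over SSH. Naming the session after the run is what lets a later SSH connection find the same job again.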

## Core Features: SSH Convenience + Structured Management Capabilities

### Comparison with Plain SSH
| Requirement | Plain SSH | aexp |
|---|---|---|
| Quick command check | `ssh host ...` | `aexp exec --resource gpu -- ...` |
| Launch long-term experiments | Manual tmux/nohup | `aexp run submit ...` |
| Find logs | Memorize paths | `aexp run logs <run_id>` |
| View resources | Remote commands | Web dashboard + resource snapshots |
| Restore context | Reconstruct shell history | Query runs/events/metrics |
| Distinguish run types | Naming conventions | Native `--kind setup\|smoke\|formal\|ablation` |

### Core Features
- Resource Management: Register SSH resources, configure environment/GPU tags, etc.;
- Run Submission: Support multiple run types, automatic environment detection;
- Project Sync: Integrate rsync for local-remote file synchronization;
- Event System: Structured recording of progress/parameter/metric events;
- Web Dashboard: Resource overview, real-time logs, monitoring, etc.
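
As an illustration of what "structured recording" of events can look like, the following Go sketch encodes a metric event as one JSON line and decodes it again. The `Event` type and all of its field names are hypothetical; the source does not describe aexp's real event schema, so treat this only as one plausible shape for progress, parameter, and metric events.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Event is a hypothetical structured record. The field names are assumptions
// for illustration and are not taken from aexp.
type Event struct {
	RunID string    `json:"run_id"`
	Kind  string    `json:"kind"` // "progress", "param", or "metric"
	Name  string    `json:"name"`
	Value float64   `json:"value"`
	Step  int       `json:"step"`
	Time  time.Time `json:"time"`
}

// encodeEvent serializes an event as a single JSON line.
func encodeEvent(e Event) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

// decodeEvent parses a JSON line back into an Event.
func decodeEvent(line string) (Event, error) {
	var e Event
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

func main() {
	e := Event{
		RunID: "run_xxx", // placeholder run ID
		Kind:  "metric",
		Name:  "loss",
		Value: 0.42,
		Step:  100,
		Time:  time.Now().UTC(),
	}
	line, err := encodeEvent(e)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(line)
}
```

Because each event is a self-describing record rather than a free-form log line, a later session can query "the loss of this run at step 100" without parsing text output.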

## MCP Integration: Interfaces Tailored for Programming Agents

aexp natively supports the Model Context Protocol (MCP), an open protocol introduced by Anthropic, so that agents can call it directly:

- **Installation Command**: `aexp mcp install --target all` (automatically configures MCP tools);
- **Available Interfaces**: `aexp_exec` (quick check), `aexp_submit_run` (submit experiment), `aexp_sync_push` (sync files), etc.;
- **Value**: Solves the problem of agent context loss, enabling programmable and automated experiment management.
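
For readers unfamiliar with MCP, a tool invocation travels as a JSON-RPC 2.0 request with the method `tools/call`, carrying the tool name and an arguments object. The Go sketch below builds such a request for the `aexp_exec` tool. The argument keys `resource` and `command` are assumptions; the source names the tool but does not document its parameters.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCallParams is the params object of an MCP "tools/call" request.
type toolCallParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  toolCallParams `json:"params"`
}

// newToolCall wraps a tool name and its arguments in a tools/call request.
func newToolCall(id int, tool string, args map[string]any) rpcRequest {
	return rpcRequest{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  toolCallParams{Name: tool, Arguments: args},
	}
}

func main() {
	req := newToolCall(1, "aexp_exec", map[string]any{
		"resource": "gpu-box",          // assumed argument name
		"command":  "df -h /workspace", // assumed argument name
	})
	b, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(b))
}
```

In practice an MCP client library builds this envelope for the agent; the sketch only shows what crosses the wire.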

## Use Cases and Quick Start Guide

### Use Cases
- Machine learning researchers: Complete experiment tracking, automated log/metric collection;
- Agent workflows: Reliable experiment submission/monitoring, with state that persists across sessions;
- Team collaboration: Share SQLite databases, view each other's experiments.

### Installation Methods
- **Binary**: `curl -fsSL https://raw.githubusercontent.com/murasame612/aexp/main/scripts/install.sh | sh`;
- **Source Compilation**: `git clone ... && go build`.

### Common Workflows
- Quick check: `aexp exec --resource gpu-box -- df -h /workspace`;
- Submit experiment: `aexp run submit --resource gpu-box --kind formal ...`;
- Monitor run: `aexp run logs run_xxx --tail 100`.
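
A script or agent harness can drive these workflows by shelling out to the CLI. The Go sketch below wraps only the two commands quoted in full above (the quick check and the log tail); `run submit` is left out because its remaining arguments are elided in the text. The helper names are hypothetical, and the program falls back to printing the command when `aexp` is not on the `PATH`.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// execArgs builds the arguments for `aexp exec --resource <r> -- <command...>`.
func execArgs(resource string, command ...string) []string {
	args := []string{"exec", "--resource", resource, "--"}
	return append(args, command...)
}

// logsArgs builds the arguments for `aexp run logs <run_id> --tail <n>`.
func logsArgs(runID string, tail int) []string {
	return []string{"run", "logs", runID, "--tail", strconv.Itoa(tail)}
}

// run executes aexp with the given arguments and returns its combined output.
func run(args []string) (string, error) {
	out, err := exec.Command("aexp", args...).CombinedOutput()
	return string(out), err
}

func main() {
	commands := [][]string{
		execArgs("gpu-box", "df", "-h", "/workspace"),
		logsArgs("run_xxx", 100), // run_xxx is the placeholder ID used in the text
	}
	_, lookErr := exec.LookPath("aexp")
	for _, args := range commands {
		if lookErr != nil {
			// aexp is not installed here, so only show what would run.
			fmt.Println("would run: aexp " + strings.Join(args, " "))
			continue
		}
		out, err := run(args)
		fmt.Print(out)
		if err != nil {
			fmt.Println("command failed:", err)
		}
	}
}
```

Passing the remote command as separate arguments after `--` avoids local shell quoting problems, since nothing is re-parsed by a local shell.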

## Limitations and Future Development Directions

### Current Limitations
- Mainly targets SSH resources; does not support Docker/Slurm/Kubernetes, etc.;
- Open-source version remains simple: local binary + SQLite + SSH + tmux.

### Future Directions
- Support containerized execution (Docker);
- Integrate Slurm cluster scheduling;
- Local execution mode;
- Richer visualization and experiment comparison tools.

## Conclusion: A Bridge Connecting Researchers and Computing Resources

aexp combines the flexibility of SSH with the structured capabilities of modern experiment management and exposes them to agents through MCP. The result is a lightweight remote GPU experiment management tool that covers tracking, execution, logs, and monitoring. Its open-source, agent-friendly design makes it a promising direction for AI-assisted research tooling that connects human researchers with computing resources.
