# RSI-DNAX: Experimental Exploration of Bounded Recursive Self-Improving Neural Networks

> An experimental framework for studying bounded recursive self-improvement. Using validation-gated evolution of code-level operators, it reports gains on an isomorphic subset of ARC-AGI-1 (for seed 42, cell accuracy rises from 0.668 to 1.0), illustrating one way to study AI self-improvement in a controlled environment.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-05-18T22:43:21.000Z
- Last activity: 2026-05-18T22:49:44.735Z
- Popularity: 161.9
- Keywords: recursive self-improvement, ARC-AGI, neural architecture search, meta-learning, AI safety, benchmark evaluation, code evolution, cognitive architecture, automated reasoning
- Page link: https://www.zingnex.cn/en/forum/thread/rsi-dnax
- Canonical: https://www.zingnex.cn/forum/thread/rsi-dnax
- Markdown source: floors_fallback

---

## RSI-DNAX: A Guide to Bounded, Controlled Recursive Self-Improvement in Neural Networks

RSI-DNAX is positioned as a research scaffold rather than an AGI system. It focuses on auditable, bounded improvement cycles so that researchers can observe and debug each step of the improvement process. Improvements come from validation-gated evolution of code-level operators and are measured on an isomorphic subset of ARC-AGI-1.

## Background and Project Positioning

Recursive self-improvement (RSI) could in theory produce rapid, compounding growth in capability, but keeping such a process controllable is the practical challenge. RSI-DNAX is neither an AGI nor evidence of a singularity. Its core goal is to build bounded improvement cycles that can be inspected and understood. Each cycle generates restricted operator programs, validates them on data outside the test set, rejects or rolls back failed attempts, freezes accepted states, and reports the results. The framework is designed to run on a CPU and prioritizes studying the improvement mechanism itself over pursuing general intelligence.
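The cycle described above (generate, validate, reject or roll back, freeze, report) can be sketched in a few lines. This is a minimal illustration, not the RSI-DNAX implementation: `bounded_improvement`, `CycleReport`, `toy_propose`, `toy_validate`, and the toy target are all hypothetical names and data chosen for the example.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

State = Tuple[int, ...]


@dataclass
class CycleReport:
    """Audit trail for one bounded run (hypothetical structure)."""
    accepted: int = 0
    rejected: int = 0
    history: List[float] = field(default_factory=list)


def bounded_improvement(
    state: State,
    propose: Callable[[State, random.Random], State],
    validate: Callable[[State], float],
    max_cycles: int,
    seed: int,
) -> Tuple[State, CycleReport]:
    """Run at most max_cycles validation-gated improvement steps."""
    rng = random.Random(seed)  # seeded so the whole run can be replayed
    best_score = validate(state)
    report = CycleReport(history=[best_score])
    for _ in range(max_cycles):  # hard bound on the number of cycles
        candidate = propose(state, rng)
        score = validate(candidate)  # validation data only, never the test set
        if score > best_score:
            state, best_score = candidate, score  # freeze the accepted state
            report.accepted += 1
        else:
            report.rejected += 1  # rollback: the previous state is kept
        report.history.append(best_score)
    return state, report


# Toy problem: the state is a tuple of digits, the validation score is the
# fraction of positions that match a fixed validation target.
TARGET: State = (3, 1, 4, 1, 5)


def toy_validate(state: State) -> float:
    return sum(a == b for a, b in zip(state, TARGET)) / len(TARGET)


def toy_propose(state: State, rng: random.Random) -> State:
    mutated = list(state)
    mutated[rng.randrange(len(mutated))] = rng.randrange(10)
    return tuple(mutated)


final_state, report = bounded_improvement(
    (0, 0, 0, 0, 0), toy_propose, toy_validate, max_cycles=200, seed=42
)
print(final_state, report.accepted, report.rejected, report.history[-1])
```

Because a candidate is accepted only when it strictly improves the validation score, the recorded score history never decreases, and the fixed cycle budget keeps the run bounded.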

## Core Architecture and Method Design

### Cognitive Core
The cognitive core is the "brain" of the system. It handles task reasoning, memory management, world model construction, and bounded improvement control, and it coordinates the subsystems so that they operate within the defined constraints.
### Adaptive Operator System
The adaptive operator system is the execution layer for self-improvement. It holds the operators and their genome representations, and it improves iteratively by generating, validating, and selecting operators.
### Candidate Generation and Sandbox
The generator produces candidates through deterministic mutation and recombination. The sandbox validates them in an isolated environment so that a failing candidate cannot affect the main system, which makes it a safety barrier.
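A rough sketch of deterministic mutation, recombination, and sandboxed execution follows. The operator set `OPS`, the list-of-names program format, and the functions `mutate`, `recombine`, and `run_in_sandbox` are assumptions made for illustration and are not taken from the RSI-DNAX code. The in-process `try/except` only stands in for isolation; a real sandbox would typically run candidates in a separate process with resource limits.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]

# Hypothetical restricted operator set: a program is a list of these names.
OPS: Dict[str, Callable[[Grid], Grid]] = {
    "identity": lambda g: [row[:] for row in g],
    "flip_h": lambda g: [row[::-1] for row in g],
    "flip_v": lambda g: [row[:] for row in g[::-1]],
    "transpose": lambda g: [list(col) for col in zip(*g)],
}


def mutate(program: List[str], seed: int) -> List[str]:
    """Replace one operator; the same (program, seed) gives the same child."""
    rng = random.Random(seed)
    child = list(program)
    child[rng.randrange(len(child))] = rng.choice(sorted(OPS))
    return child


def recombine(a: List[str], b: List[str], seed: int) -> List[str]:
    """One-point crossover with a seeded cut position."""
    shortest = min(len(a), len(b))
    if shortest < 2:
        return list(a)  # nothing to cut, return a copy of the first parent
    cut = random.Random(seed).randrange(1, shortest)
    return a[:cut] + b[cut:]


def run_in_sandbox(program: List[str], grid: Grid) -> Tuple[bool, Optional[Grid]]:
    """Run a candidate on a copy of the input and contain any failure."""
    try:
        out = [row[:] for row in grid]  # never hand the caller's grid to a candidate
        for name in program:
            out = OPS[name](out)  # an unknown operator raises KeyError
        return True, out
    except Exception:
        return False, None  # failure is reported, not propagated


parent = ["flip_h", "identity", "transpose"]
child = mutate(parent, seed=7)
ok, result = run_in_sandbox(child, [[1, 2], [3, 4]])
print(child, ok, result)
```

Deriving each mutation from an explicit seed is what makes a candidate reproducible later, which is the property deterministic replay relies on.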
### Failure Grammar
The failure grammar records failed candidates and extracts rules from them. These rules guide later candidate generation so that the system avoids repeating the same errors, which makes exploration more efficient.
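One simple way to picture a failure grammar is as a filter learned from rejected candidates. The sketch below is an assumption about how such a mechanism could work, not the project's actual rule format: the class `FailureGrammar` and its rule type (adjacent operator pairs that appear in several failed programs) are hypothetical.

```python
from collections import Counter
from typing import List, Set, Tuple

Bigram = Tuple[str, str]


class FailureGrammar:
    """Learn operator pairs that keep showing up in failed candidates."""

    def __init__(self, min_count: int = 2) -> None:
        self.min_count = min_count  # failures needed before a pair is banned
        self.bigram_failures: Counter = Counter()

    def record_failure(self, program: List[str]) -> None:
        # Count each distinct adjacent pair once per failed program.
        self.bigram_failures.update(set(zip(program, program[1:])))

    def banned(self) -> Set[Bigram]:
        return {bg for bg, n in self.bigram_failures.items() if n >= self.min_count}

    def allows(self, program: List[str]) -> bool:
        """Reject a new candidate that contains any banned pair."""
        banned = self.banned()
        return not any(bg in banned for bg in zip(program, program[1:]))


grammar = FailureGrammar(min_count=2)
grammar.record_failure(["flip_h", "flip_h", "transpose"])
grammar.record_failure(["identity", "flip_h", "flip_h"])
print(grammar.banned())
print(grammar.allows(["transpose", "flip_h", "flip_h"]))
```

A filter this naive can also ban a pair that occurs in successful programs, so a practical version would weigh failures against successes before turning a pattern into a rule.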
### Evaluator Evolution
The evaluator itself is mutated tentatively, and each mutation is subject to adversarial checks. This keeps the evaluation criteria in step with the evolving system and counts as meta-level evolution.

## Experimental Results on ARC-AGI Benchmark

On an isomorphic subset of ARC-AGI-1, a widely used abstract-reasoning benchmark, the reported results are:
- Full mode (seed 42): cell accuracy rose from 0.668 to 1.0 (+33.2 percentage points, roughly a 50% relative gain), and exact grid accuracy rose from 0 to 1;
- Fast mode: exact grid accuracy reached 0.4;
- Cross-seed expansion: average retained cell accuracy rose from 0.875 to 0.931, and average exact grid accuracy rose from 0.333 to 0.458.
All results are backed by anti-cheating checks (data isolation, deterministic replay, dead code detection, and others).
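To make the two metrics and the size of the gains concrete, the sketch below computes them. The definitions are the usual ones and are an assumption here, since the post does not spell them out: cell accuracy is the fraction of grid cells predicted correctly, and exact grid accuracy counts a grid only if every cell matches. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]


def cell_accuracy(pred: Grid, target: Grid) -> float:
    """Fraction of matching cells; a shape mismatch scores 0.0 (assumed convention)."""
    same_shape = len(pred) == len(target) and all(
        len(p) == len(t) for p, t in zip(pred, target)
    )
    total = sum(len(row) for row in target)
    if not same_shape or total == 0:
        return 0.0
    correct = sum(p == t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    return correct / total


def exact_grid_match(pred: Grid, target: Grid) -> bool:
    """True only if every cell of the grid is correct."""
    return pred == target


def gain(before: float, after: float) -> Tuple[float, Optional[float]]:
    """Absolute gain, plus relative gain when the baseline is non-zero."""
    absolute = after - before
    relative = absolute / before if before > 0 else None
    return absolute, relative


# Figures quoted in the post.
for label, before, after in [
    ("full mode, cell accuracy (seed 42)", 0.668, 1.0),
    ("cross-seed, retained cell accuracy", 0.875, 0.931),
    ("cross-seed, exact grid accuracy", 0.333, 0.458),
]:
    absolute, relative = gain(before, after)
    print(f"{label}: absolute {absolute:+.3f}, relative {relative:+.1%}")
```

The distinction matters when reading the numbers: a grid can have high cell accuracy and still fail the exact-match criterion, which is why the two metrics are reported separately.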

## Code-level and Architecture-level Self-Improvement Mechanisms

### Code-level Improvement
Code-level self-improvement works through an operator DSL. The system generates and modifies operator programs and applies the improvement mechanism recursively, so that it improves both the task strategies and the improvement process itself. A HumanEval adapter is used to exercise this capability.
### Architecture Evolution
The neural_search module supports deterministic mutation of architecture genomes with weight inheritance. World Model V2 introduces object-centric representations, causal graphs, and counterfactual reasoning, which are intended as a foundation for more complex reasoning.
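Deterministic mutation with weight inheritance can be illustrated on a tiny fully connected network. The sketch assumes a genome that is simply a list of layer widths and an inheritance rule that copies the overlapping block of each weight matrix and zero-fills new entries. Neither assumption comes from the neural_search module, and all function names are hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]


def init_weights(widths: List[int], seed: int) -> List[Matrix]:
    """One (out x in) matrix per pair of consecutive layers, seeded."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in range(widths[i])] for _ in range(widths[i + 1])]
        for i in range(len(widths) - 1)
    ]


def mutate_widths(widths: List[int], seed: int, max_delta: int = 2) -> List[int]:
    """Resize one hidden layer; input and output widths stay fixed."""
    if len(widths) < 3:
        return list(widths)  # no hidden layer to mutate
    rng = random.Random(seed)
    child = list(widths)
    idx = rng.randrange(1, len(widths) - 1)
    delta = rng.choice([d for d in range(-max_delta, max_delta + 1) if d != 0])
    child[idx] = max(1, child[idx] + delta)
    return child


def inherit_weights(parent: List[Matrix], child_widths: List[int]) -> List[Matrix]:
    """Copy the overlapping block of each parent matrix; new entries start at 0.0."""
    child: List[Matrix] = []
    for i, old in enumerate(parent):
        rows, cols = child_widths[i + 1], child_widths[i]
        new = [[0.0] * cols for _ in range(rows)]
        for r in range(min(rows, len(old))):
            for c in range(min(cols, len(old[0]))):
                new[r][c] = old[r][c]
        child.append(new)
    return child


parent_widths = [4, 3, 2]
parent_weights = init_weights(parent_widths, seed=0)
child_widths = mutate_widths(parent_widths, seed=0)
child_weights = inherit_weights(parent_weights, child_widths)
print(parent_widths, "->", child_widths)
```

Copying the overlapping block lets the child start from what the parent already learned, instead of retraining every mutated architecture from scratch.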

## Anti-cheating and Auditability Guarantees

To keep the results credible, the framework implements several mechanisms:
- Data splitting and isolation: strict training/validation/test splits prevent information leakage;
- Deterministic replay: every experiment can be reproduced from its recorded seed and configuration;
- Dead code detection: unused code paths are excluded so that they cannot influence results;
- Control strategy audit: each improvement is checked against the safety constraints.
Together, these mechanisms make the reported gains checkable rather than a matter of trust.
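Two of these checks, split isolation and deterministic replay, are easy to sketch. The helper names (`split_task_ids`, `check_isolation`, `fingerprint`, `is_replayable`), the 60/20/20 split, and the hash-based comparison are illustrative assumptions, not the framework's API.

```python
import hashlib
import json
import random
from typing import Callable, Dict, List, Sequence


def split_task_ids(
    task_ids: Sequence[str], seed: int, train_frac: float = 0.6, val_frac: float = 0.2
) -> Dict[str, List[str]]:
    """Seeded split of task ids into train/validation/test."""
    ordered = sorted(set(task_ids))  # canonical order before shuffling
    random.Random(seed).shuffle(ordered)
    n_train = int(len(ordered) * train_frac)
    n_val = int(len(ordered) * val_frac)
    return {
        "train": ordered[:n_train],
        "validation": ordered[n_train:n_train + n_val],
        "test": ordered[n_train + n_val:],
    }


def check_isolation(splits: Dict[str, List[str]]) -> None:
    """Raise if any task id appears in more than one split."""
    names = sorted(splits)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            overlap = set(splits[a]) & set(splits[b])
            if overlap:
                raise ValueError(f"leak between {a} and {b}: {sorted(overlap)}")


def fingerprint(result: object) -> str:
    """Stable hash of a JSON-serializable result."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_replayable(run: Callable[[int], object], seed: int) -> bool:
    """Run twice with the same seed and compare result fingerprints."""
    return fingerprint(run(seed)) == fingerprint(run(seed))


def seeded_run(seed: int) -> List[float]:
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


splits = split_task_ids([f"task_{i}" for i in range(10)], seed=42)
check_isolation(splits)
print({name: len(ids) for name, ids in splits.items()}, is_replayable(seeded_run, 42))
```

Comparing fingerprints instead of raw outputs keeps the replay check cheap to store alongside each experiment record.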

## Limitations and Future Directions

### Limitations
- ARC results are not official leaderboard scores;
- HumanEval tests do not prove general programming ability;
- Exact grid accuracy for seed 44 remains 0.0, which shows that the gains do not hold for every seed.
### Future Plans
Planned work includes upgrading the interactive residual layers, meta-RSI coordination, and the deep architecture, while keeping the principle of bounded auditability.

## Implications for AI Research

The core lessons from RSI-DNAX are:
1. Boundaries are key: unconstrained improvement is dangerous and difficult to study;
2. Auditability first: each improvement step needs to be inspectable and verifiable;
3. Learn from failure: the failure grammar mechanism puts negative results to use;
4. Multi-level improvement: evolving several levels at once (operators, architecture, and so on) can produce compound effects.
The project has value on two fronts: it gives safety researchers a platform for testing control mechanisms, and it shows capability researchers concrete improvement paths.
