Zing Forum


VLA-HRM: Innovative Application of Recursive Reasoning Models in Robotic Control

This project applies TRM (Tiny Recursive Model) and HRM (Hierarchical Reasoning Model) to robotic manipulation tasks. Through recursive weight-sharing computation and continuous observation encoding, it outperforms a diffusion policy baseline on the PushT task.

Tags: robot learning, recursive models, imitation learning, reinforcement learning, diffusion policy, robot control, open-source project
Published 2026-03-30 19:03 · Recent activity 2026-03-30 19:23 · Estimated read: 5 min

Section 01

VLA-HRM Project Introduction: Innovative Application of Recursive Reasoning Models in Robotic Control

The VLA-HRM project adapts TRM (Tiny Recursive Model) and HRM (Hierarchical Reasoning Model), originally built for discrete reasoning tasks, to continuous robotic control. On the PushT task (pushing a T-shaped block to a target position), designs such as continuous observation encoding and recursive weight sharing let it outperform the diffusion policy baseline while using far fewer parameters.


Section 02

Background: Challenges from Discrete Reasoning to Continuous Robotic Control

Recursive reasoning models (such as TRM/HRM) were initially used for discrete tasks (sudoku, mazes, etc.). Robotic control (like the PushT task), by contrast, has a continuous observation space (a 5-dimensional state: agent position, block position, block angle) and a continuous action space (a 2-dimensional target position), and requires long-horizon planning under complex contact dynamics. Adapting discrete reasoning models to this continuous control setting is the core challenge of the VLA-HRM project.
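To make the dimensions above concrete, here is a minimal sketch of the PushT state and action layout. The field ordering and coordinate values are illustrative assumptions, not the actual environment API.

```python
import numpy as np

# Hypothetical PushT state layout (ordering is an assumption):
# 5-dimensional continuous observation.
obs = np.array([
    256.0, 300.0,   # agent (pusher) x, y position
    280.0, 260.0,   # T-block x, y position
    1.57,           # T-block rotation angle (radians)
], dtype=np.float32)

# 2-dimensional continuous action: a target position for the agent.
action = np.array([270.0, 265.0], dtype=np.float32)

assert obs.shape == (5,)
assert action.shape == (2,)
```

Both spaces being continuous is what rules out the token-level discrete decoding that TRM/HRM used for sudoku-style tasks.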


Section 03

Evolution of Technical Solutions and Core Model Architecture

The project went through three iterations: V1 (discrete observation/action, failed) → V2 (continuous observation + discrete action, partially successful) → V3 (fully continuous, breakthrough). In the core architecture, TRM uses a recursive, weight-sharing design (a single module handles both the high and low levels, which is memory-efficient), while HRM introduces an explicit hierarchy (high-level planning, low-level control). A key innovation is the use of action query tokens, which support decoding all actions in parallel.
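The recursive weight sharing and action query tokens described above can be sketched in a few lines. This is a toy numpy illustration under stated assumptions: all weight shapes, the tanh recursion, and the query-conditioning scheme are illustrative, not the project's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the project's configuration).
obs_dim, act_dim, d_model, horizon = 5, 2, 32, 8

W_enc = rng.normal(size=(obs_dim, d_model)) * 0.1   # continuous observation encoder
W_core = rng.normal(size=(d_model, d_model)) * 0.1  # ONE shared recursive weight matrix
W_act = rng.normal(size=(d_model, act_dim)) * 0.1   # continuous regression head
queries = rng.normal(size=(horizon, d_model))       # learned action query tokens

def policy(obs, n_steps=6):
    obs_tok = obs @ W_enc                  # encode the continuous state
    tokens = queries + obs_tok             # condition every query on the state
    for _ in range(n_steps):               # recursion: W_core is reused each step,
        tokens = np.tanh(tokens @ W_core)  # so depth grows without new parameters
    return tokens @ W_act                  # decode all actions in parallel

actions = policy(rng.normal(size=(obs_dim,)))
assert actions.shape == (horizon, act_dim)
```

The point of the sketch is the parameter accounting: effective depth comes from reapplying the same weights, which is why the recursive models stay small, and each query token yields one action step, so the whole action sequence is produced in a single parallel decode rather than autoregressively.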


Section 04

Training Strategies and Key Optimization Techniques

The project uses several training techniques to improve performance: observation noise augmentation (Gaussian noise to prevent overfitting), geometric feature engineering (21 hand-designed geometric features that inject domain knowledge), data augmentation (mirror symmetry, expanding the data 4x), and iterative refinement (multi-step improvement of action sequences, reaching a score of 0.942 at K=8 refinement steps).
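Two of these augmentations are easy to show in code. The sketch below assumes a state layout of `[agent_x, agent_y, block_x, block_y, angle]` and a square workspace of width 512; both are illustrative assumptions, as is the helper naming.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_obs_noise(obs, sigma=0.01):
    """Gaussian observation noise: perturb the state slightly at train
    time so the policy does not overfit exact demonstration states."""
    return obs + rng.normal(scale=sigma, size=obs.shape)

def mirror_x(obs, action, width=512.0):
    """Mirror-symmetry augmentation: reflect a (state, action) pair
    across the vertical axis. Assumes obs = [ax, ay, bx, by, theta]."""
    m_obs = obs.copy()
    m_obs[[0, 2]] = width - m_obs[[0, 2]]   # flip agent/block x coordinates
    m_obs[4] = -m_obs[4]                    # flip the block angle
    m_act = action.copy()
    m_act[0] = width - m_act[0]             # flip the target x coordinate
    return m_obs, m_act

obs = np.array([100.0, 200.0, 150.0, 250.0, 0.5])
act = np.array([120.0, 210.0])

noisy = add_obs_noise(obs)
m_obs, m_act = mirror_x(obs, act)
assert noisy.shape == obs.shape
assert np.isclose(m_obs[0], 412.0) and np.isclose(m_act[0], 392.0)
```

The mirrored sample is a valid demonstration because PushT's dynamics are symmetric under reflection; composing reflections (and applying them to both state and action consistently) is what multiplies the dataset size.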


Section 05

Experimental Result Analysis and Comparison

Results show: HRM V8 (h=384) achieves an average score of 0.558, outperforming the diffusion policy (0.507) with only 1/8 as many parameters; continuous regression over actions beats discrete quantization; and TRM slightly outperforms HRM under the same configuration (possibly because the PushT task lacks an obvious hierarchical structure).


Section 06

Key Insights and Future Directions

Key insights: continuous representation is crucial for robotic control; recursive architectures suit sequential decision-making; observation augmentation effectively prevents overfitting; geometric priors accelerate learning. Limitations: state input only, single-task specialization, simulation-only evaluation. Future directions: VLA expansion (Vision-Language-Action), multi-task learning, real-robot validation, and combining with diffusion models.