Zing Forum


MouseArmImitationLearning: A Neural Network Training Framework for Mouse Forelimb Movement Control Based on Imitation Learning

An imitation learning project that uses reinforcement learning to train deep neural networks for controlling biomechanical and torque-driven models, achieving precise movement control by minimizing the difference between expected and actual movements.

Tags: Imitation Learning · Reinforcement Learning · Biomechanics · MuJoCo · PPO · LSTM · Neural Motor Control · StableBaselines3
Published 2026-05-14 14:25 · Recent activity 2026-05-14 14:31 · Estimated read: 8 min

Section 01

Introduction to the MouseArmImitationLearning Project

MouseArmImitationLearning is an open-source project from the Al Borno Lab at the University of Colorado Denver, developed by Dylan Zelkin under the supervision of Mazen Al Borno. The project focuses on imitation learning, using reinforcement learning to train deep neural networks that control a biomechanical model of the mouse forelimb. Its core goal is to minimize the difference between the expected movement trajectory and the movement actually executed, achieving precise control. The project builds on the MuJoCo physics engine, the PPO algorithm, and LSTM networks, with applications in neuroscience, robotics, and rehabilitation medicine.


Section 02

Scientific Background and Motivation

Importance of Biomechanical Modeling

Understanding how organisms control movement matters greatly to neuroscience, robotics, and rehabilitation medicine. The mouse has a relatively simple nervous system yet produces complex forelimb movement patterns, making the mouse forelimb an ideal model for studying mammalian motor control.

Advantages of Imitation Learning

Traditional movement control requires manually designing complex controllers. Imitation learning instead lets the network generate control signals automatically by observing expected trajectories, which suits the highly nonlinear, tightly coupled dynamics of biomechanical systems.


Section 03

Technical Architecture and Implementation

Physical Simulation Environment

  • Adapted from the biomechanical model by Gilmer et al., ported to the MuJoCo engine
  • Supports two driving modes: torque-driven (simplified control) and muscle-driven (biologically realistic)
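The two driving modes differ mainly in how the actuators are defined in the MuJoCo model file. A minimal, illustrative MJCF fragment is shown below; the joint, site, and tendon names are hypothetical and not taken from the project's actual model:

```xml
<!-- Torque-driven: a motor applies torque directly at a joint -->
<actuator>
  <motor joint="elbow" gear="1" ctrlrange="-1 1"/>
</actuator>

<!-- Muscle-driven: a muscle actuator pulls along a tendon path
     routed through attachment sites, giving biologically
     realistic, pull-only actuation -->
<tendon>
  <spatial name="biceps_path">
    <site site="origin_site"/>
    <site site="insertion_site"/>
  </spatial>
</tendon>
<actuator>
  <muscle tendon="biceps_path" ctrlrange="0 1"/>
</actuator>
```

Torque-driven models are easier to learn because the mapping from control signal to joint motion is direct; muscle-driven models add activation dynamics and pull-only constraints closer to real physiology.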

Reinforcement Learning Algorithm

The project uses the StableBaselines3 library's implementation of the PPO algorithm, chosen for its stability and sample efficiency.

Neural Network Architecture

A shared LSTM backbone feeds two heads: a reward head that estimates the state-value function and an action head that outputs action probabilities. The recurrent backbone captures temporal dependencies in the movement.
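The described architecture can be sketched in PyTorch as follows. Layer sizes, class names, and the discrete action head are illustrative assumptions; the project builds its policy through StableBaselines3's recurrent policies rather than raw PyTorch:

```python
import torch
import torch.nn as nn

class LSTMActorCritic(nn.Module):
    """Shared LSTM backbone with a value ('reward') head and an action head."""

    def __init__(self, obs_dim: int, n_actions: int,
                 hidden_size: int = 128, n_layers: int = 1):
        super().__init__()
        # Shared recurrent backbone: captures temporal dependencies
        self.lstm = nn.LSTM(obs_dim, hidden_size, n_layers, batch_first=True)
        self.value_head = nn.Linear(hidden_size, 1)           # estimates V(s)
        self.action_head = nn.Linear(hidden_size, n_actions)  # action logits

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, time, obs_dim)
        features, state = self.lstm(obs_seq, state)
        return self.value_head(features), self.action_head(features), state

net = LSTMActorCritic(obs_dim=20, n_actions=8)
values, logits, _ = net(torch.zeros(2, 5, 20))
print(values.shape, logits.shape)  # torch.Size([2, 5, 1]) torch.Size([2, 5, 8])
```

Sharing the backbone lets the value estimate and the action distribution draw on the same temporal features, which is the standard actor-critic arrangement PPO expects.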

Generalized Movement Learning

Adding a vector of future kinematic position differences to the observation space makes it possible to train a generalized model that can execute arbitrary movements (enabled by adjusting the path_steps parameter).
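One way to read this: at each timestep, the observation is extended with the offsets from the current target position to the targets several frames ahead, so the policy sees where the movement is going rather than only where it is. A minimal sketch in plain Python; the function name, 2-D positions, and end-of-trajectory clamping are illustrative assumptions, not the project's exact scheme:

```python
def future_position_diffs(trajectory, t, path_steps):
    """Offsets from the target at time t to each of the next path_steps targets.

    trajectory: list of (x, y) target positions; the final frame is repeated
    when the lookahead horizon extends past the end of the movement.
    """
    x0, y0 = trajectory[t]
    diffs = []
    for k in range(1, path_steps + 1):
        x, y = trajectory[min(t + k, len(trajectory) - 1)]
        diffs.extend([x - x0, y - y0])
    return diffs

traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)]
obs_extra = future_position_diffs(traj, t=0, path_steps=2)
print(obs_extra)  # [1.0, 0.0, 2.0, 1.0]
```

Because the network is conditioned on these relative offsets rather than a fixed trajectory, the same trained policy can track any target path supplied at test time.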


Section 04

Core Configuration Parameters

  • General Parameters: name (model name)
  • Environment Parameters: model (MuJoCo model file), kinematics (kinematic data), train_ratio (training ratio), etc.
  • Reward Function Weights: w_bone_diff (bone position difference), w_paw (paw difference), w_effort (actuator effort), etc.
  • Simulation Parameters: control_dt (simulation time step), n_substeps (number of substeps)
  • Policy Network Parameters: lstm_hidden_size (LSTM size), n_lstm_layers (number of layers)
  • Algorithm Parameters: learning_rate (learning rate), batch_size (batch size)
  • Training/Testing Parameters: timesteps (total steps), eval_freq (evaluation frequency), slowmo (slow motion)
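Taken together, these parameters suggest a configuration shaped roughly like the dictionary below. Every value is a placeholder for illustration, not one of the project's defaults:

```python
# Illustrative configuration; all values are placeholders.
config = {
    # General
    "name": "mouse_arm_demo",
    # Environment
    "model": "arm_model.xml",            # MuJoCo model file
    "kinematics": "reach_trajectories.csv",
    "train_ratio": 0.8,
    # Reward function weights
    "w_bone_diff": 1.0,
    "w_paw": 1.0,
    "w_effort": 0.01,
    # Simulation
    "control_dt": 0.01,                  # seconds per control step
    "n_substeps": 5,
    # Policy network
    "lstm_hidden_size": 128,
    "n_lstm_layers": 1,
    # Algorithm
    "learning_rate": 3e-4,
    "batch_size": 64,
    # Training / testing
    "timesteps": 1_000_000,
    "eval_freq": 10_000,
    "slowmo": 1.0,
}
```

Grouping the parameters this way makes clear which knobs change the task (environment, reward weights) versus which change the optimizer (algorithm, policy network).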

Section 05

Usage Workflow

  1. Environment Setup: Create a conda environment from environment.yml; optionally install TensorBoard and Huggingface Hub
  2. Model Download: Obtain MuJoCo models, kinematic data, etc., from Huggingface Hub
  3. Training: Run train.py; results are saved to ./agents/
  4. Visualization: Use TensorBoard to view metrics like reward curves and losses
  5. Testing: Run test.py to verify model performance in the real-time viewer
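The five steps above map onto commands roughly like the following. The environment name, script flags, and log directory are assumptions; check the project's README for the exact invocations:

```shell
conda env create -f environment.yml   # step 1: create the environment
conda activate mouse-arm              # environment name is a placeholder
python train.py                       # step 3: train; results land in ./agents/
tensorboard --logdir ./agents/        # step 4: inspect reward curves and losses
python test.py                        # step 5: replay the policy in the viewer
```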

Section 06

Scientific Value and Application Prospects

  • Neuroscience: Explore movement control strategies and verify theoretical hypotheses
  • Robotics: Apply to robot arm control, gait generation, and learning skills from human demonstrations
  • Rehabilitation Medicine: Simulate movement disorders caused by nerve damage and test rehabilitation intervention plans
  • Biomechanics: Study the characteristics of the musculoskeletal system and verify biomechanical hypotheses

Section 07

Technical Challenges and Solutions

  • High-dimensional Continuous Control: Use deep RL + LSTM for end-to-end learning to avoid manual design
  • Simulation Stability: Adjust n_substeps to increase the number of simulation substeps
  • Reward Design: Provide multiple adjustable weight components to adapt to different tasks
  • Generalization Ability: Introduce future kinematic information as input to achieve arbitrary movement control
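The adjustable-weight reward design above can be sketched as a weighted sum of error terms: tracking terms that decay as bone and paw errors grow, minus an effort penalty. The functional form and names below are an illustration, not the project's exact reward:

```python
import math

def tracking_reward(bone_diff, paw_diff, effort,
                    w_bone_diff=1.0, w_paw=1.0, w_effort=0.01):
    """Illustrative imitation reward: Gaussian-shaped tracking terms
    minus a quadratic actuator-effort penalty."""
    return (
        w_bone_diff * math.exp(-bone_diff**2)   # bone position tracking
        + w_paw * math.exp(-paw_diff**2)        # paw position tracking
        - w_effort * effort**2                  # discourage wasted effort
    )

perfect = tracking_reward(bone_diff=0.0, paw_diff=0.0, effort=0.0)
print(perfect)  # 2.0
```

Exposing the weights lets a user trade off tracking precision against energy use per task, e.g. raising w_effort to favor smoother, lower-torque solutions.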

Section 08

Project Summary

MouseArmImitationLearning is a fully functional imitation learning research platform that combines biomechanical modeling, physical simulation, and deep RL, providing a powerful tool for neural movement control research. The project is open-source with detailed documentation, supports parameter adjustment and generalized movement learning, and holds strong potential for future applications across multiple fields.