Zing Forum

RUS-Former: A Transformer-Based Visual Servoing Robotic System for Carotid Artery Ultrasound Scanning

RUS-Former is an innovative visual servoing robotic control system that combines carotid artery segmentation with a multimodal visual servoing Transformer control model to achieve end-to-end automated ultrasound scanning from image feature extraction to robot motion control.

Tags: medical robotics, visual servoing, Transformer, ultrasound scanning, carotid artery, deep learning, GitHub, robot control
Published 2026-06-15 10:10 | Recent activity 2026-06-15 10:26 | Estimated read 6 min

Section 01

[Introduction] RUS-Former: A Transformer-Based Visual Servoing Robotic System for Carotid Artery Ultrasound Scanning

RUS-Former is a visual servoing robotic control system developed by IRM-Lab for carotid artery ultrasound scanning. It combines carotid artery segmentation with a multimodal visual servoing Transformer control model, so that the whole chain from image feature extraction to robot motion control runs end to end. The project is open source on GitHub (https://github.com/IRM-Lab/RUS-Former; update date June 15, 2026). The system targets the difficulties of traditional manual scanning and offers a technical reference for the medical robotics field.

Section 02

Background: Challenges and Automation Needs of Carotid Artery Ultrasound Scanning

Carotid artery ultrasound examination is an important method for assessing cardiovascular disease risk, but traditional manual scanning has several problems: the anatomy is complex, image quality depends on operator experience, long sessions cause operator fatigue, and results are hard to standardize. Robot-assisted scanning must therefore solve three core challenges: understanding anatomical structures in real time, controlling motion precisely, and keeping the patient safe and comfortable.

Section 03

System Architecture and Methods

RUS-Former adopts an end-to-end learning framework, with core modules including:

  1. Carotid artery segmentation module: multi-scale feature extraction, edge-aware loss, real-time performance optimization;
  2. Multimodal visual servoing Transformer: visual feature encoding (segmentation results + original images), robot state encoding, cross-modal attention mechanism, control output decoding;
  3. End-to-end training: expert data collection, imitation learning, reinforcement fine-tuning.
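
The cross-modal attention step in module 2 can be illustrated with plain scaled dot-product attention, where an encoded robot state acts as the query and visual tokens act as keys and values. This is only a minimal sketch under stated assumptions: the function names, the token dimensions, and the omission of learned projection matrices are simplifications for illustration and are not taken from the RUS-Former repository.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(state_query: Vector,
                          visual_tokens: List[Vector]) -> Tuple[Vector, Vector]:
    """Attend from one robot-state query over a set of visual tokens.

    Assumption: tokens serve as both keys and values, with no learned
    projections, so this shows the mechanism rather than a trained layer.
    """
    dim = len(state_query)
    # Scaled dot product between the query and every visual token.
    scores = [
        sum(q * k for q, k in zip(state_query, token)) / math.sqrt(dim)
        for token in visual_tokens
    ]
    weights = softmax(scores)
    # Weighted sum of the visual tokens gives the fused feature vector.
    fused = [
        sum(w * token[i] for w, token in zip(weights, visual_tokens))
        for i in range(dim)
    ]
    return fused, weights


# Hypothetical 2-D tokens: the query is aligned with the first token.
fused, weights = cross_modal_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(fused, weights)
```

The attention weights are also what makes this kind of model inspectable: they show which image regions the controller relied on for a given command.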

Overall workflow: Ultrasound image input → Segmentation module → Multimodal fusion → Visual servoing Transformer → Robot control output.
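
The workflow above is essentially a composition of three stages. The sketch below wires them together with placeholder stages so that the data flow is visible. Everything in it is hypothetical: the type names, the 6-element command layout, the threshold segmentation, and the proportional rule are stand-ins chosen so the example runs, not the project's actual models.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]   # grayscale pixels in [0, 1]
Mask = List[List[int]]      # 1 = vessel, 0 = background
Features = List[float]


@dataclass
class ControlCommand:
    # Assumed layout: probe twist [vx, vy, vz, wx, wy, wz].
    velocity: List[float]


def run_step(image: Image,
             robot_state: List[float],
             segment: Callable[[Image], Mask],
             fuse: Callable[[Image, Mask, List[float]], Features],
             policy: Callable[[Features], List[float]]) -> ControlCommand:
    """One pass: image -> mask -> fused features -> control command."""
    mask = segment(image)
    features = fuse(image, mask, robot_state)
    return ControlCommand(velocity=policy(features))


# --- Placeholder stages (NOT the RUS-Former models) ---

def threshold_segment(image: Image, level: float = 0.5) -> Mask:
    # Stand-in for the segmentation network: the vessel lumen appears dark.
    return [[1 if px < level else 0 for px in row] for row in image]


def centroid_fuse(image: Image, mask: Mask, robot_state: List[float]) -> Features:
    # Stand-in for multimodal fusion: horizontal offset of the mask centroid
    # from the image centre, scaled to [-1, 1], followed by the robot state.
    width = len(mask[0])
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not cols or width < 2:
        offset = 0.0
    else:
        centre = (width - 1) / 2
        offset = (sum(cols) / len(cols) - centre) / centre
    return [offset] + list(robot_state)


def proportional_policy(features: Features, gain: float = 0.01) -> List[float]:
    # Stand-in for the Transformer: move laterally toward the vessel centroid.
    return [gain * features[0], 0.0, 0.0, 0.0, 0.0, 0.0]


frame = [[0.9, 0.9, 0.1, 0.9],
         [0.9, 0.9, 0.1, 0.9]]
command = run_step(frame, [0.0] * 6, threshold_segment, centroid_fuse, proportional_policy)
print(command.velocity)
```

Passing the stages in as callables keeps the loop unchanged when a placeholder is swapped for a trained model.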

Section 04

Core Technical Innovations

  1. Integration of visual servoing and deep learning: Convert visual servoing into a sequence learning task, with temporal modeling and attention guidance to enhance interpretability;
  2. Multimodal information fusion: Integrate three types of information: ultrasound images, segmentation masks, and robot states;
  3. Integration of safety constraints: Force control, motion range, and speed constraints to ensure safety in medical applications.
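
The three safety constraints in item 3 can be pictured as a filter applied to every velocity command before it reaches the robot. The following is a rough sketch only: the limit values, the stop-on-excess-force rule, and the per-axis workspace check are assumptions made for illustration and do not describe how RUS-Former enforces its constraints.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SafetyLimits:
    # All values are illustrative assumptions, not clinical settings.
    max_speed: float = 0.02                       # m/s
    max_force: float = 5.0                        # N
    workspace_min: Tuple[float, float, float] = (-0.1, -0.1, 0.0)   # m
    workspace_max: Tuple[float, float, float] = (0.1, 0.1, 0.1)     # m


def apply_safety(velocity: List[float],
                 position: List[float],
                 contact_force: float,
                 limits: SafetyLimits,
                 dt: float = 0.05) -> List[float]:
    """Return a translational velocity that respects force, speed, and range limits."""
    # Force constraint: stop all motion when contact force is too high.
    if contact_force > limits.max_force:
        return [0.0, 0.0, 0.0]

    # Speed constraint: scale the vector down, preserving its direction.
    speed = math.sqrt(sum(v * v for v in velocity))
    if speed > limits.max_speed:
        scale = limits.max_speed / speed
        velocity = [v * scale for v in velocity]

    # Motion-range constraint: zero any axis that would leave the workspace
    # within the next control period dt.
    safe = []
    for v, p, lo, hi in zip(velocity, position,
                            limits.workspace_min, limits.workspace_max):
        predicted = p + v * dt
        safe.append(0.0 if predicted < lo or predicted > hi else v)
    return safe


limits = SafetyLimits()
print(apply_safety([0.1, 0.0, 0.0], [0.0, 0.0, 0.05], 1.0, limits))
```

Placing the filter after the learned controller means the limits hold regardless of what the model outputs.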

Section 05

Application Scenarios and Value

  1. Clinical diagnosis assistance: standardized scanning, quality assurance, efficiency improvement;
  2. Telemedicine: bringing expert resources to community-level clinics, real-time monitoring, training and teaching;
  3. Large-scale screening: high throughput, consistency, data accumulation.

Section 06

Limitations and Challenges

Current limitations: generalization ability needs improvement, exception handling still needs verification, and the regulatory approval process is complex. Technical challenges: strict real-time requirements, the need for formal safety verification, and balancing human-machine collaboration.

Section 07

Future Development Directions

Technical evolution: extension to other anatomical sites (such as the thyroid), 3D ultrasound integration, and fusion with other imaging modalities (CT/MRI). Clinical translation: clinical trials, regulatory certification, and development of clinical training systems.

Section 08

Summary

RUS-Former combines deep learning, visual servoing, and medical ultrasound to automate carotid artery ultrasound scanning. Its end-to-end Transformer architecture maps image understanding directly to motion control, which illustrates the potential of data-driven methods. It may help improve the quality of care, reduce costs, and expand service coverage, and it deserves attention from researchers working across these disciplines. Project address: https://github.com/IRM-Lab/RUS-Former.