# Omni VA: A Local AI Virtual Assistant Supporting Multimodal Interaction

> Omni VA is a desktop virtual assistant based on local large language models, supporting voice interaction, music playback, and multimodal input. It integrates the OmniStep Evolution Radio plugin to provide personalized music recommendations.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-06-08T20:41:08.000Z
- Last activity: 2026-06-08T20:50:27.558Z
- Popularity: 150.8
- Keywords: AI assistant, multimodal, local model, voice interaction, Live2D, Qwen, virtual assistant, music recommendation
- Page link: https://www.zingnex.cn/en/forum/thread/omni-va-ai
- Canonical: https://www.zingnex.cn/forum/thread/omni-va-ai
- Markdown source: floors_fallback

---

## Introduction: Omni VA, a Local Multimodal AI Virtual Assistant

Omni VA is a desktop virtual assistant built on local large language models. It supports voice interaction, music playback, and multimodal input and output. Its main features are a natively multimodal design, a Live2D virtual avatar, a three-layer architecture (virtual assistant interface, priority distribution, and execution agent), and the OmniStep Evolution Radio plugin for personalized music recommendations. The project is maintained by SouthpawIN, with source code available on GitHub (https://github.com/SouthpawIN/nous-girl-agent), and was released on June 8, 2026.

## Project Background and Overview

Omni VA is more than a chatbot: it is a multi-layer system that combines local large models with multimodal interaction to act as a personalized AI companion. By default it accepts text, voice, and visual input and produces text and voice output, which enables interactions such as voice conversations and context-aware music playback.

## Technical Architecture and Core Components

Omni VA is built from three layers that work together:
1. **Omni VA Interface Layer**: Built on a fork of Open-LLM-VTuber, it uses Live2D to present the virtual avatar. Key features include low resource consumption, multimodal interaction (web search, notes, etc.), voice-first interaction, and graceful degradation: text-only models fall back to Edge TTS for speech output.
2. **Senter Distribution Layer**: Runs on demand, reads the wiki notes generated by the virtual assistant, and returns a priority-sorted task list that determines the order of execution.
3. **Hermes Agent Execution Layer**: A full-featured tool-using agent with code execution, terminal operation, task delegation, and computer control capabilities. It executes the tasks assigned by Senter.
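The hand-off between the distribution and execution layers can be sketched as follows. This is a minimal illustration, not the project's code: `Task`, `distribute`, `execute_all`, and `keyword_score` are hypothetical names, and the "lower number means more urgent" convention and the keyword scoring rule are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(order=True)
class Task:
    # Assumed convention: a lower number means a more urgent task.
    priority: int
    description: str = field(compare=False)


def distribute(notes: list[str], score: Callable[[str], int]) -> list[Task]:
    """Senter-like step: turn notes into a priority-sorted task list."""
    tasks = [Task(score(note), note) for note in notes]
    return sorted(tasks)  # stable sort, so ties keep their note order


def execute_all(tasks: list[Task], run: Callable[[str], str]) -> list[str]:
    """Hermes-like step: run tasks in the order the distributor chose."""
    return [run(task.description) for task in tasks]


def keyword_score(note: str) -> int:
    """Toy scoring rule, purely illustrative."""
    return 0 if "urgent" in note.lower() else 1


notes = [
    "Summarize today's notes",
    "URGENT: back up config",
    "Queue evening playlist",
]
ordered = distribute(notes, keyword_score)
results = execute_all(ordered, lambda description: f"done: {description}")
```

Passing the scoring and execution steps in as callables keeps the sketch independent of any particular model or tool backend. In a real system they would presumably be backed by an LLM call and an agent toolset.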

## Detailed Explanation of the OmniStep Evolution Radio Plugin

OmniStep Evolution Radio is a featured component of the project with self-evolution capabilities:
- **Core Functions**: It perceives user interactions, generates playlists, learns music preferences through LoRA training, and improves through feedback from the Ohm evolution chain.
- **Technical Implementation**: It is based on the Qwen2.5-Omni-3B multimodal model, which natively accepts text, voice, and visual input and produces text and voice output. The project maintains a curated model directory (models/curated.yaml) containing 8 entries.
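The perceive-then-recommend loop described above can be illustrated with a much simpler stand-in. The sketch below does not use LoRA training or the Ohm evolution chain. It only accumulates per-track scores from interaction events and ranks tracks by score. The event names, the weights in `EVENT_WEIGHTS`, and both function names are assumptions made for this example.

```python
from collections import defaultdict

# Assumed event weights; the plugin's real feedback signal is not documented here.
EVENT_WEIGHTS = {"like": 2.0, "finish": 1.0, "skip": -1.5}


def update_scores(scores, events):
    """Fold (track, event) pairs into a running per-track preference score."""
    for track, event in events:
        scores[track] += EVENT_WEIGHTS.get(event, 0.0)
    return scores


def build_playlist(scores, library, size):
    """Rank the library by score (highest first) and keep the top `size` tracks."""
    ranked = sorted(library, key=lambda track: scores.get(track, 0.0), reverse=True)
    return ranked[:size]


library = ["track_a", "track_b", "track_c"]
events = [("track_b", "like"), ("track_a", "skip"), ("track_b", "finish")]

scores = update_scores(defaultdict(float), events)
playlist = build_playlist(scores, library, size=2)
```

Here `track_b` ends up with the highest score, the untouched `track_c` stays at zero, and the skipped `track_a` drops out of the two-track playlist.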

## Model Directory and Configuration Options

The project provides a layered model directory so users can choose a model that fits their hardware and needs:

| Layer | Models |
|------|------|
| Native Multimodal | Qwen2.5-Omni-3B (default), OmniStep (coming soon) |
| Text with TTS | Darwin-28B, APEX-MTP, Qwen3-Coder-30B-A3B, Qwen3.5-27B-Claude, Qwen3.5-27B-Sushi, Qwen3.5-35B-A3B |
| Auxiliary | OmniSenter (Phase 1 coming soon) |
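One way to read the table: the layer a model belongs to decides whether speech comes from the model itself or from the Edge TTS fallback. The sketch below encodes that rule. It is an assumption-laden illustration: the dictionary layout is not the schema of `models/curated.yaml` (which is not shown here), the "coming soon" entries are omitted, and `tier_of` and `speech_backend` are hypothetical helper names.

```python
# Model names copied from the table above; the structure itself is hypothetical.
CATALOG = {
    "native_multimodal": ["Qwen2.5-Omni-3B"],
    "text_with_tts": [
        "Darwin-28B",
        "APEX-MTP",
        "Qwen3-Coder-30B-A3B",
        "Qwen3.5-27B-Claude",
        "Qwen3.5-27B-Sushi",
        "Qwen3.5-35B-A3B",
    ],
}
DEFAULT_MODEL = "Qwen2.5-Omni-3B"


def tier_of(model: str) -> str:
    """Return the catalog layer that lists `model`, or raise KeyError."""
    for tier, models in CATALOG.items():
        if model in models:
            return tier
    raise KeyError(f"unknown model: {model}")


def speech_backend(model: str) -> str:
    """Native speech for multimodal models, Edge TTS fallback for text-only ones."""
    return "native" if tier_of(model) == "native_multimodal" else "edge-tts"


default_backend = speech_backend(DEFAULT_MODEL)
fallback_backend = speech_backend("Darwin-28B")
```

With this layout, switching the default to a text-only model changes only the speech path, which matches the fallback behavior described for the interface layer.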

## Installation and Usage Guide

Installation Steps:
1. Clone the repository: `git clone https://github.com/SouthpawIN/nous-girl-agent`
2. Run the installation script: `./scripts/install.sh`
3. Start the full system: `./scripts/dev.sh` (starts VA, radio, agent, and bridge simultaneously)

Start Components Separately:
- Live2D VA: `./scripts/run-assistant.sh`
- Radio Plugin: `./scripts/run-radio.sh start`
- Curator Agent: `./scripts/run-agent.sh`
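For starting a subset of components from one place, a small launcher along these lines may be convenient. It is a sketch only: the script paths and the `start` argument come from the list above, while the short names (`assistant`, `radio`, `agent`) and the `plan` and `launch` helpers are hypothetical and not part of the repository.

```python
import subprocess

# Commands taken from the component list above; the dictionary keys are made up.
COMPONENT_COMMANDS = {
    "assistant": ["./scripts/run-assistant.sh"],
    "radio": ["./scripts/run-radio.sh", "start"],
    "agent": ["./scripts/run-agent.sh"],
}


def plan(names):
    """Return the command lines for the requested components without running them."""
    return [COMPONENT_COMMANDS[name] for name in names]


def launch(names, repo_dir):
    """Start each requested component as a child process inside the cloned repo."""
    return [subprocess.Popen(command, cwd=repo_dir) for command in plan(names)]


commands = plan(["assistant", "radio"])
```

Calling `launch(["assistant", "radio"], "nous-girl-agent")` from the directory where the repository was cloned would start those two scripts. `plan` alone is useful for checking what would run.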

## Project Significance and Future Outlook

Omni VA illustrates one direction for local AI assistants: achieving intelligence and personalization through careful architecture. Its multimodal design, self-evolution capabilities, and layered architecture offer useful ideas for local AI applications. It is worth studying for developers interested in AI agents, local large model deployment, and multimodal interaction.
