Zing Forum


Draco AI V1: A Localized MoE Large Model Based on Qwen

Draco AI V1 is a localized large language model developed based on Qwen 3.5 9B. It transforms the original dense architecture using MoE technology, integrates advanced reasoning capabilities and a memory system, and provides a deeply personalized user experience.

Tags: Draco AI · MoE (Mixture of Experts) · Qwen · Local deployment · Memory system · Large language model · Personalized AI
Published 2026-04-24 16:34 · Recent activity 2026-04-24 16:56 · Estimated read: 7 min

Section 01

[Main Thread Guide] Draco AI V1: Core Introduction to the Localized MoE Large Model Based on Qwen

Draco AI V1 is a localized large language model developed based on Alibaba's Qwen 3.5 9B. It transforms the original dense architecture using Mixture of Experts (MoE) technology, integrates advanced reasoning capabilities and a memory system, and is committed to providing a deeply personalized AI experience. Its core advantages include data privacy protection through local deployment, low-latency responses, offline availability, and controllable costs, offering a new option for users who value privacy and personalization.


Section 02

Background and Selection of Base Model

Base Model Selection

The project selects Qwen 3.5 9B as the base model because it has excellent Chinese understanding, code generation capabilities, and multilingual support. The 9B parameter scale balances performance and resource consumption, making it suitable for local deployment scenarios.

Advantages of Local Deployment

  • Data Privacy: User data is fully processed locally without cloud upload;
  • Low Latency: Local inference responses are instant;
  • Offline Availability: Works normally without a network environment;
  • Controllable Costs: Avoids ongoing API fees; after the one-time hardware investment, long-term use incurs no recurring charges.

Section 03

Technical Approach: MoE Architecture Transformation and Core Function Implementation

MoE Architecture Transformation

Converting the dense model to an MoE architecture reduces inference costs through a sparse activation mechanism. Core advantages include parameter efficiency (a large total parameter count with only a few experts activated per token), specialized division of labor (different experts handle different tasks), and scalability (adding experts to expand capabilities). Implementation challenges involve expert initialization, routing network design, load balancing, and training stability.
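The sparse activation mechanism described above hinges on the routing network: for each token, a router scores all experts and forwards the token only to the top-k of them, with the selected gate weights renormalized to sum to 1. The post does not publish Draco AI V1's router, so the following is a minimal illustrative sketch (expert count, k=2, and the function names are assumptions, not the project's actual code):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def top_k_route(logits, k=2):
    """Pick the k experts with the highest router logits and
    renormalize their gate weights so they sum to 1 (only these
    k experts run for this token -- that is the sparse activation)."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    chosen = ranked[:k]
    gates = softmax([logits[i] for i in chosen])
    return list(zip(chosen, gates))  # [(expert_index, gate_weight), ...]

# Example: a token's router logits over 8 experts; route to the top 2.
logits = [0.1, 2.3, -0.5, 1.7, 0.0, 0.4, -1.2, 0.9]
routes = top_k_route(logits, k=2)
```

In a real MoE layer the token's hidden state would be passed through each chosen expert and the outputs combined with these gate weights; the load-balancing challenge mentioned above arises when the router concentrates traffic on a few experts, which is typically countered with an auxiliary balancing loss.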

Core Function Technologies

  • Advanced Reasoning: Enhances reasoning performance through Chain-of-Thought training, reasoning-specific experts, and RLHF/DPO optimization;
  • Memory System: Includes short-term working memory, long-term episodic memory, user profile memory, and semantic memory, with memory retrieval implemented using vector databases (e.g., FAISS).
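The memory retrieval described in the last bullet boils down to nearest-neighbor search over embeddings: stored memories are embedded as vectors, and at query time the most similar entries are recalled. In production this would use a vector database such as FAISS, as the post notes; the dependency-free sketch below substitutes a plain cosine-similarity scan over a Python list to show the retrieval logic (the `MemoryStore` class and the toy 3-dimensional embeddings are illustrative assumptions):

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

class MemoryStore:
    """Toy vector memory: store (embedding, text) pairs and
    retrieve the entries most similar to a query embedding."""
    def __init__(self):
        self.entries = []  # list of (vector, text)

    def add(self, vector, text):
        self.entries.append((vector, text))

    def retrieve(self, query, top_n=1):
        scored = sorted(self.entries,
                        key=lambda e: cosine(e[0], query),
                        reverse=True)
        return [text for _, text in scored[:top_n]]

store = MemoryStore()
store.add([1.0, 0.0, 0.0], "user prefers concise answers")
store.add([0.0, 1.0, 0.0], "user is learning Rust")
hits = store.retrieve([0.9, 0.1, 0.0], top_n=1)  # query near the first entry
```

Swapping the linear scan for a FAISS index changes the search from O(n) to approximate nearest-neighbor, but the store/retrieve interface stays the same.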

Section 04

Core Functional Features and Application Scenarios

Memory System Features

The memory system is a key differentiator of Draco AI V1, supporting cross-session memory of user dialogues and preferences, personalized adaptation of response styles, and long-term knowledge accumulation.
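Cross-session memory of the kind described here requires persisting user preferences outside the model process so they survive restarts. The post does not detail Draco AI V1's storage layer, so as one plausible sketch, a local key-value profile store on SQLite (part of Python's standard library, and consistent with the project's local-only privacy stance) might look like this (`open_memory`, `remember`, and `recall` are hypothetical names):

```python
import sqlite3

def open_memory(path=":memory:"):
    # ":memory:" is used here for demonstration; a real deployment
    # would point at an on-disk file so profile memory survives restarts.
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS profile (
                        key TEXT PRIMARY KEY, value TEXT)""")
    return conn

def remember(conn, key, value):
    # Upsert one profile fact, e.g. the user's preferred reply style.
    conn.execute("INSERT OR REPLACE INTO profile VALUES (?, ?)",
                 (key, value))
    conn.commit()

def recall(conn, key):
    row = conn.execute("SELECT value FROM profile WHERE key = ?",
                       (key,)).fetchone()
    return row[0] if row else None

conn = open_memory()
remember(conn, "reply_style", "concise, bullet points")
style = recall(conn, "reply_style")
```

Structured profile facts like this cover the "user profile memory" tier; the episodic and semantic tiers mentioned in Section 03 would sit on top, backed by embedding retrieval rather than exact keys.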

Application Scenarios

  • Personal AI Assistant: Remembers schedules and preferences, providing personalized services;
  • Professional Domain Consultant: Can be customized for fields such as law and medicine;
  • Educational Tutoring: Tracks learning progress and provides targeted tutoring;
  • Creative Writing Partner: Remembers creative styles and settings for collaborative writing.

Section 05

Current Limitations and Future Development Directions

Limitations and Challenges

  • Hardware Requirements: Smooth operation requires sufficient memory and a reasonably recent CPU/GPU;
  • Memory System Complexity: Long-term memory management involves issues like forgetting mechanisms and conflict resolution;
  • Model Updates: Local models must be updated manually, and upgrading smoothly while preserving accumulated memory remains an open problem.
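The forgetting mechanism flagged in the second bullet is commonly handled with time-based decay: each memory's relevance score decays with age, and entries that fall below a threshold become candidates for pruning. The post does not specify Draco AI V1's policy, so the exponential half-life scheme below (half-life of 30 days, pruning threshold of 0.25, both assumed values) is only one simple illustration:

```python
def retention(age_days, half_life_days=30.0):
    """Exponential decay: a memory's relevance halves every
    half_life_days since it was last touched."""
    return 0.5 ** (age_days / half_life_days)

def prune(memories, threshold=0.25, half_life_days=30.0):
    # memories: list of (age_in_days, text); keep entries whose
    # decayed relevance is still at or above the threshold.
    return [text for age, text in memories
            if retention(age, half_life_days) >= threshold]

memories = [(0, "today's chat"), (30, "last month"), (90, "old note")]
kept = prune(memories)  # 90-day entry decays to 0.125 and is dropped
```

A fuller policy would also boost scores on re-access (so frequently recalled facts persist) and fold in the conflict-resolution step the bullet mentions, e.g. preferring the newer of two contradictory facts.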

Future Directions

  • Multimodal Expansion: Integrate visual understanding to support image-text dialogue;
  • Tool Usage: Support function calls and integration with external tools;
  • Distributed Memory: Cross-device memory synchronization for a seamless experience under privacy protection.

Section 06

Summary and Recommendations

Draco AI V1 represents the development direction of localized large models. It improves efficiency through MoE and achieves personalization via the memory system, providing users with an AI option where data is independently controllable. It is recommended for users who care about privacy and pursue personalized experiences to try this project, while looking forward to further optimization in multimodal integration, tool integration, and other directions.