micro-kiki: Innovative Practice of 35 Domain Expert LoRAs and Cognitive Layer Architecture

A multi-domain expert system built on Qwen3.6-35B-A3B that achieves precise reasoning and continuous learning in professional fields through 35 LoRA adapters and a three-layer cognitive architecture consisting of Aeon Memory, CAMP Negotiation, and KnowBias Anti-bias.

Tags: LoRA, MoE, Qwen, domain experts, MLX, multimodal, routing, cognitive architecture, catastrophic forgetting, quantization, inference, open-source models
Published 2026-04-21 00:44 · Recent activity 2026-04-21 00:51 · Estimated read 6 min

Section 01

micro-kiki: Innovative Practice of 35 Domain Expert LoRAs and Cognitive Layer Architecture

micro-kiki is a multi-domain expert system built on Qwen3.6-35B-A3B. It achieves precise reasoning and continuous learning in professional fields through 35 LoRA adapters and a three-layer cognitive architecture (Aeon Memory, CAMP Negotiation, KnowBias Anti-bias). This article covers the project's background, architecture, training, and deployment.


Section 02

Project Background and Core Positioning

micro-kiki is the deployed result of the dreamOfkiki research project under Hypneum Lab, led by Clément Saillant. Its core goal is to build an AI system capable of handling 35 professional domains. The base model is Qwen3.6-35B-A3B, whose MoE architecture (256 experts, with only about 3 billion parameters active per token) balances efficiency and capacity and supports an ultra-long context of 262,000 tokens.


Section 03

Innovation of Three-Layer Cognitive Architecture

micro-kiki introduces a three-layer cognitive architecture:

  1. MetaRouter: a sigmoid classifier that supports multi-domain activation (up to 4 adapters simultaneously), routes on semantic features, and handles cross-domain problems;
  2. Aeon Memory System: a dual-store architecture (Atlas semantic memory, Trace graph-structured memory) that maintains context coherence in multi-turn dialogue; in tests it averaged more than 36 memory recalls across 14 dialogue turns;
  3. CAMP Negotiation and KnowBias Filtering: coordinates the opinions of multiple experts to prevent groupthink, and keeps output neutral and professional through bias detection and framing deconstruction.
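The routing step can be sketched as a multi-label sigmoid classifier that activates every domain whose score clears a threshold, capped at 4 adapters. This is an illustrative sketch only: the domain names, threshold, and random projection standing in for a trained classifier head are assumptions, not the project's actual implementation.

```python
import numpy as np

RNG = np.random.default_rng(0)
DOMAINS = ["kicad-dsl", "spice-sim", "stm32", "electronics", "dsp"]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def route(features: np.ndarray, weights: np.ndarray, bias: np.ndarray,
          threshold: float = 0.5, max_active: int = 4) -> list[str]:
    """Return up to `max_active` domains whose sigmoid score exceeds
    `threshold`, highest score first (multi-label, not softmax)."""
    scores = sigmoid(weights @ features + bias)   # one independent score per domain
    ranked = np.argsort(scores)[::-1]             # best first
    active = [i for i in ranked if scores[i] > threshold][:max_active]
    return [DOMAINS[i] for i in active]

# Toy example: random weights stand in for a trained classifier head.
features = RNG.normal(size=16)
weights = RNG.normal(size=(len(DOMAINS), 16))
bias = np.zeros(len(DOMAINS))
print(route(features, weights, bias))
```

Because each score is an independent sigmoid rather than a softmax, a cross-domain query (say, STM32 firmware driving a SPICE-simulated circuit) can legitimately activate several adapters at once.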

Section 04

Technical Details of LoRA Adapter Training

Optimal LoRA training configuration: 32 of the model's 40 layers, rank=16 / alpha=16, learning rate 1e-5, 100-1000 iterations. Training hardware is a Mac Studio M3 Ultra with 512 GB unified memory (BF16 training peaks at 107 GB). Forgetting-gate mechanism: a rollback is triggered when the angle between the new adapter's weights and an existing adapter's is under 30 degrees (i.e., high cosine similarity) and the benchmark win rate drops by more than 3%, preventing catastrophic forgetting.
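The forgetting-gate check described above can be sketched as follows: flatten the adapter weights, measure the angle between the new adapter and each existing one, and roll back when the directions are too similar and the win rate has regressed. The function names, thresholds as code constants, and toy vectors are illustrative assumptions.

```python
import numpy as np

# Thresholds from the article: angle < 30 degrees AND win-rate drop > 3%.
ANGLE_LIMIT_DEG = 30.0
WIN_RATE_DROP_LIMIT = 0.03

def angle_deg(a: np.ndarray, b: np.ndarray) -> float:
    """Angle in degrees between two flattened weight vectors."""
    cos_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos_sim, -1.0, 1.0))))

def should_rollback(new_w: np.ndarray, existing_ws: list,
                    old_win_rate: float, new_win_rate: float) -> bool:
    """Rollback only if the new adapter nearly duplicates an existing
    one (angle under 30 degrees) AND benchmark quality regressed."""
    too_similar = any(angle_deg(new_w, w) < ANGLE_LIMIT_DEG for w in existing_ws)
    regressed = (old_win_rate - new_win_rate) > WIN_RATE_DROP_LIMIT
    return too_similar and regressed

# Example: a near-duplicate adapter whose win rate also dropped 5 points.
base = np.ones(8)
near_copy = base + 0.01 * np.arange(8)
print(should_rollback(near_copy, [base], old_win_rate=0.80, new_win_rate=0.75))  # True
```

Requiring both conditions means a redundant-but-harmless adapter, or a regression caused by something other than interference, does not trigger a rollback on its own.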


Section 05

Verified Domain Coverage

Currently, 10 SFT domain adapters have been trained, with partial domain data as follows:

Domain        Training samples   Final loss   Typical scenario
kicad-dsl     694                0.42         PCB design
spice-sim     368                0.38         Circuit simulation
stm32         711                0.44         Firmware development
electronics   1900               0.43         General electronic engineering

Among them, the four domains of SPICE, STM32, electronics, and DSP have passed the forgetting-gate test, with good cross-domain compatibility.

Section 06

Deployment and Inference Solutions

Two deployment solutions are provided:

  • Mac Studio: MLX framework with Q4_K_M quantization; set memory/cache limits to avoid GPU suspension;
  • RTX 4090: vLLM with AWQ quantization; 24 GB of VRAM can hold the base model plus 2-4 adapters, with inference at 30-50 tokens/second. Consumer-grade GPUs are not recommended for training, which requires over 100 GB of memory.
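A back-of-envelope check shows why the 24 GB budget works for inference but not training. This sketch assumes 4-bit AWQ weights and fp16 adapters; the per-adapter parameter count is a rough assumption, and real usage adds KV cache and activation overhead on top.

```python
# Rough VRAM budget for the RTX 4090 deployment path described above.
TOTAL_PARAMS = 35e9          # Qwen3.6-35B-A3B total parameter count
AWQ_BITS = 4                 # 4-bit weight quantization
LORA_PARAMS = 50e6           # assumed size of one rank-16 adapter
LORA_BYTES_PER_PARAM = 2     # fp16 adapter weights

def gb(n_bytes: float) -> float:
    return n_bytes / 1024**3

base_gb = gb(TOTAL_PARAMS * AWQ_BITS / 8)
adapter_gb = gb(LORA_PARAMS * LORA_BYTES_PER_PARAM)

for n_adapters in (2, 4):
    total = base_gb + n_adapters * adapter_gb
    print(f"base + {n_adapters} adapters ~ {total:.1f} GB (budget: 24 GB)")
```

The quantized base model alone lands around 16 GB, and each adapter adds only a fraction of a gigabyte, leaving headroom for the KV cache; BF16 training of the full model, by contrast, needs two bytes per parameter plus optimizer and gradient state, which is why the article's 107 GB training peak rules out consumer cards.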

Section 07

Open Source Ecosystem and Related Projects

micro-kiki belongs to the Hypneum Lab ecosystem:

  • KIKI-Mac_tunner: Training execution and MLX pipeline;
  • nerve-wml: Neural protocol advisor bridging;
  • dream-of-kiki: Sister project for dream-like knowledge integration.

The dataset (489K samples), the lightweight 4B model, and the full 35B model (including adapters) have been released on Hugging Face.

Section 08

Project Summary and Value

micro-kiki shows that LoRA composition, intelligent routing, and a cognitive architecture can deliver deep coverage of many domains on consumer-grade hardware. Its forgetting gate and bias-filtering mechanisms offer a methodology for developing domain-expert models, and the project is worth following and contributing to for engineers and researchers deploying AI in technical fields.