1. Wake Word Detection
Supports custom wake words (e.g., 'Jarvis'). OpenWakeWord runs detection locally, so activation is low-latency and needs no network connection.
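As a rough illustration of how per-frame wake word scores might be turned into activation events, the sketch below applies a confidence threshold and a short cooldown so that one spoken wake word does not fire several times. It assumes the detector yields one score between 0 and 1 per audio frame. The function name, threshold, and cooldown values are hypothetical and are not taken from OpenWakeWord's API.

```python
from typing import Iterable, List


def detect_activations(
    scores: Iterable[float],
    threshold: float = 0.5,      # assumed confidence cutoff
    cooldown_frames: int = 3,    # assumed frames to ignore after a hit
) -> List[int]:
    """Return the frame indices at which the wake word is considered detected."""
    activations: List[int] = []
    frames_until_ready = 0
    for index, score in enumerate(scores):
        if frames_until_ready > 0:
            # Still inside the cooldown window of the previous activation.
            frames_until_ready -= 1
            continue
        if score >= threshold:
            activations.append(index)
            frames_until_ready = cooldown_frames
    return activations


# Hypothetical scores: one wake word spanning frames 2-4, another at frame 7.
example_scores = [0.1, 0.2, 0.9, 0.95, 0.8, 0.1, 0.05, 0.7]
print(detect_activations(example_scores))  # [2, 7]
```

The cooldown is a design choice for the sketch: without it, consecutive high-scoring frames from a single utterance would each count as a separate activation.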
2. Real-Time Speech-to-Text
Uses the Whisper model for real-time speech-to-text. An 'ambient transcription' feature records and transcribes continuously in the background, and the resulting transcripts can be used to produce daily activity summaries.
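One way to picture how ambient transcripts could feed a daily summary is to store each transcribed segment with a timestamp and group the text by calendar day. The sketch below does only that grouping step. The `Segment` type and `group_by_day` helper are hypothetical names for illustration, and the transcription and summarization steps themselves are assumed to happen elsewhere.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class Segment:
    """A transcribed chunk of background audio (hypothetical structure)."""
    started_at: datetime
    text: str


def group_by_day(segments: List[Segment]) -> Dict[str, str]:
    """Join segment texts per ISO date, in chronological order."""
    days: Dict[str, List[str]] = defaultdict(list)
    for segment in sorted(segments, key=lambda s: s.started_at):
        days[segment.started_at.date().isoformat()].append(segment.text.strip())
    return {day: " ".join(texts) for day, texts in days.items()}


segments = [
    Segment(datetime(2024, 5, 2, 9, 0), "Reviewed the release notes."),
    Segment(datetime(2024, 5, 1, 14, 5), " Discussed the API design. "),
    Segment(datetime(2024, 5, 1, 10, 30), "Stand-up meeting."),
]
for day, text in group_by_day(segments).items():
    print(day, "->", text)
```

Each day's joined text could then be passed to the language model as the input for that day's summary.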
3. Large Language Model Integration
Supports multiple providers: OpenAI, HuggingFace (Pipeline or Endpoint), and Llama.cpp. Quantized models such as Llama 3 and Mistral 7B can run locally, and remote HuggingFace inference endpoints are also supported. Model parameters are managed through a JSON configuration.
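A JSON-based provider configuration might look something like the sketch below, which parses a config string, checks the provider name, and fills in a default parameter block. The key names (`provider`, `model_path`, `params`), the provider identifiers, and the model path are all assumptions made for illustration. The project's actual schema is not specified here.

```python
import json
from typing import Any, Dict

# Hypothetical identifiers for the providers listed above.
SUPPORTED_PROVIDERS = {
    "openai",
    "huggingface_pipeline",
    "huggingface_endpoint",
    "llamacpp",
}


def load_llm_config(raw: str) -> Dict[str, Any]:
    """Parse a JSON config string and validate the provider field."""
    config = json.loads(raw)
    provider = config.get("provider")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported provider: {provider!r}")
    # Ensure generation parameters always exist, even if empty.
    config.setdefault("params", {})
    return config


example = """
{
  "provider": "llamacpp",
  "model_path": "models/example-q4.gguf",
  "params": {"temperature": 0.7, "max_tokens": 256}
}
"""
cfg = load_llm_config(example)
print(cfg["provider"], cfg["params"])
```

Keeping parameters in a separate `params` object lets the same loader serve local and remote providers, with each backend reading only the keys it understands.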
4. Semantic Search and OpenRecall Integration
Through the OpenRecall integration, takes screenshots at regular intervals and indexes them as a record of user activity, so past activity can be retrieved by semantic search (e.g., querying 'the interface researched at 2 PM').
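Semantic retrieval of this kind is commonly implemented by embedding each indexed item as a vector and ranking items by cosine similarity to the query vector. The sketch below shows that ranking step with hand-written toy vectors. The index entries and the `search` helper are hypothetical. A real system would obtain the vectors from an embedding model, and this is not a description of how OpenRecall itself stores or searches its data.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query_vec: Sequence[float],
    index: List[Tuple[str, Sequence[float]]],
    top_k: int = 1,
) -> List[str]:
    """Return the labels of the top_k entries most similar to the query."""
    ranked = sorted(
        index,
        key=lambda entry: cosine_similarity(query_vec, entry[1]),
        reverse=True,
    )
    return [label for label, _ in ranked[:top_k]]


# Toy index: (screenshot description, made-up embedding).
index = [
    ("14:00 API documentation in browser", [0.9, 0.1, 0.0]),
    ("15:30 budget spreadsheet", [0.0, 0.2, 0.9]),
    ("16:10 chat window", [0.1, 0.9, 0.1]),
]
query = [0.8, 0.2, 0.1]  # stand-in for an embedded query
print(search(query, index))
```

With these toy vectors the query is closest to the 14:00 entry, which mirrors the kind of time-and-topic lookup described above.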
5. Text-to-Speech
Uses Piper for offline TTS, generating natural-sounding voice responses.
6. MCP Support
Connects to external MCP servers to expand capabilities. Both local (stdio) and remote (HTTP) servers are supported, along with dynamic tool loading and authentication.
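To make the local/remote distinction concrete, the sketch below classifies server entries by transport and builds the JSON-RPC 2.0 `tools/list` request that MCP clients use to discover a server's tools. The configuration layout (`command`, `args`, `url`, `headers`), the server names, and the example URL are assumptions for illustration. Only the JSON-RPC envelope and the `tools/list` method name come from the MCP specification.

```python
import json
from typing import Any, Dict


def transport_for(server: Dict[str, Any]) -> str:
    """Infer the transport from a (hypothetical) server config entry."""
    if "command" in server:
        return "stdio"   # local process, spoken to over stdin/stdout
    if "url" in server:
        return "http"    # remote server reached over HTTP
    raise ValueError("Server entry needs either 'command' or 'url'")


def tools_list_request(request_id: int) -> str:
    """Build the JSON-RPC 2.0 message that asks a server for its tools."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    )


# Hypothetical configuration with one local and one remote server.
servers = {
    "files": {"command": "example-mcp-server", "args": ["--root", "."]},
    "remote": {
        "url": "https://mcp.example.com/mcp",
        "headers": {"Authorization": "Bearer <token>"},
    },
}

for name, entry in servers.items():
    print(name, transport_for(entry))
print(tools_list_request(1))
```

Because tools are discovered by asking each server at connection time, new tools can appear without changes to the client, which is what makes dynamic tool loading possible.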