# Social Draft: On-Device AI Social Reply Assistant, Making Every Response Just Right

> An iOS on-device-first social reply assistant app that helps users find natural, thoughtful responses in awkward moments through local large model inference, distilled SFT training, and user-controllable LoRA personalization.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-26T07:15:19.000Z
- Last activity: 2026-04-26T07:21:12.135Z
- Popularity: 163.9
- Keywords: on-device AI, social assistant, LoRA, local LLM, iOS app, privacy protection, SwiftUI, llama.cpp, AI reply suggestions, personalized models
- Page link: https://www.zingnex.cn/en/forum/thread/social-draft-ai
- Canonical: https://www.zingnex.cn/forum/thread/social-draft-ai
- Markdown source: floors_fallback

---

## Introduction: Social Draft, an On-Device AI Social Reply Assistant

Social Draft is an on-device-first iOS social reply assistant whose core goal is to ease the anxiety users feel when they do not know how to respond to a message. It positions itself as a "reply advisor" rather than a chatbot: the final decision about what to send always stays in the user's hands. Local large-model inference, distilled SFT training, and user-controllable LoRA personalization drive its natural, thoughtful reply suggestions, while the on-device-first design protects privacy, since in local mode data never leaves the device.

## Background: Pain Points of Social Replies and Product Positioning

The product targets the awkward moments of real conversations: declining an invitation when tired, politely turning down an unwanted event, finding the right tone in an emotional exchange, writing concise professional replies for work or school, or simply struggling to express oneself. Unlike AI products on the market that "speak for users", Social Draft chooses to "help users speak", leaving the final decision on every reply with the user.

## Methodology: On-Device-First Technical Architecture and Personalization Support

The app uses a three-tier backend architecture: Mock (the default safe mode, requiring no model or API key), Cloud (calling hosted APIs such as OpenAI, Anthropic, or Gemini), and Local (an on-device GGUF model, privacy-first). Local mode is the core: it needs no network, and everything runs on the device. Users can also train LoRA adapters locally for style personalization, privacy protection, and lightweight adaptation. The current build supports Llama-3.2-1B/3B-Instruct-Q4_K_M models and the reply_sft_lora_v1 adapter.
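A minimal Swift sketch of what such a three-tier abstraction might look like follows. The protocol and type names are illustrative assumptions, not the app's actual API, and the cloud request/response shape is invented for the example.

```swift
import Foundation

/// Hypothetical backend abstraction; all names here are illustrative.
protocol ReplyBackend {
    /// Returns reply suggestions for the given conversation context.
    func suggestions(for context: [String]) async throws -> [String]
}

/// Mock: the default safe mode, canned output, no model or API key needed.
struct MockBackend: ReplyBackend {
    func suggestions(for context: [String]) async throws -> [String] {
        ["Thanks for thinking of me! I'm pretty worn out tonight. Next week instead?"]
    }
}

/// Cloud: forwards only the necessary context to a hosted API.
struct CloudBackend: ReplyBackend {
    let endpoint: URL
    let apiKey: String

    func suggestions(for context: [String]) async throws -> [String] {
        var request = URLRequest(url: endpoint)
        request.httpMethod = "POST"
        request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        request.httpBody = try JSONEncoder().encode(["messages": context])
        let (data, _) = try await URLSession.shared.data(for: request)
        // Response shape is assumed: a bare JSON array of suggestion strings.
        return try JSONDecoder().decode([String].self, from: data)
    }
}

/// Local: runs a GGUF model (plus an optional LoRA) entirely on device.
struct LocalBackend: ReplyBackend {
    let modelPath: String   // e.g. a Llama-3.2-1B-Instruct-Q4_K_M .gguf file
    let loraPath: String?   // e.g. the reply_sft_lora_v1 adapter

    func suggestions(for context: [String]) async throws -> [String] {
        // Inference via llama.xcframework would go here; omitted in this sketch.
        throw NSError(domain: "SketchOnly", code: -1)
    }
}
```

Keeping the three modes behind one protocol is what lets Mock stay the safe default while Cloud and Local remain drop-in swaps.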

## Evidence: Core Features Enable Natural and Appropriate Responses

The app's core features, sketched in code below, aim at natural and appropriate responses:

- Smart suggestion cards in multiple styles (natural, direct, friendly, thoughtful, decision-oriented).
- Ghost Text inline completion, which runs locally to speed up replies.
- Context awareness: it reads recent messages to pick up tone and topic.
- Reply targeting: it generates suggestions for a specific message in the thread.
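As a rough illustration of how the style-tagged cards might be modeled, here is a small Swift sketch; the style cases are read off the feature list above, but the types themselves are assumptions, not the app's source.

```swift
import Foundation

/// Hypothetical models for the suggestion cards; names are illustrative.
enum SuggestionStyle: String, CaseIterable, Codable {
    case natural, direct, friendly, thoughtful
    case decisionOriented = "decision-oriented"
}

/// One card pairs a style with a draft reply the user can tap to use or edit.
struct ReplySuggestion: Identifiable, Codable {
    let id: UUID
    let style: SuggestionStyle
    let text: String
}

/// Builds one card per generated draft so the user can compare tones at a glance.
func makeCards(from drafts: [SuggestionStyle: String]) -> [ReplySuggestion] {
    drafts.map { ReplySuggestion(id: UUID(), style: $0.key, text: $0.value) }
}
```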

## Research Support: Complete Workflow from Data to Model

The project ships a complete data-to-model workflow: a two-stage Claude distillation pipeline generates a synthetic social-reply dataset; full LoRA/SFT training notebooks let users train personalized models on their own dialogue data; and tools in the Experiments_Benchmarks directory compare reply quality between local and cloud models.
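To make the pipeline concrete, here is a hedged Swift sketch of what one distilled training example might look like on disk. The field names and the JSONL layout are assumptions for illustration; the notebooks' actual schema is not described above.

```swift
import Foundation

/// Hypothetical shape of one distilled training example.
struct SFTExample: Codable {
    let scenario: String          // e.g. "declining an invitation when tired"
    let conversation: [String]    // recent messages providing context
    let style: String             // target tone, e.g. "friendly"
    let reply: String             // Claude-distilled target reply
}

/// Serializes examples as JSONL (one record per line), a common SFT input format.
func writeJSONL(_ examples: [SFTExample], to url: URL) throws {
    let encoder = JSONEncoder()
    let lines = try examples.map { String(decoding: try encoder.encode($0), as: UTF8.self) }
    try lines.joined(separator: "\n").write(to: url, atomically: true, encoding: .utf8)
}
```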

## Technical Implementation: iOS Architecture and Multi-Backend Support

The app is built with SwiftUI in a layered architecture (AppRootView, Features/Chat/Settings, and so on); local inference runs through llama.xcframework (loading GGUF models, mounting LoRA adapters, and tuning the sampling chain); the cloud path supports OpenAI, Anthropic, Groq, and others; and a Supabase integration handles data persistence for the demo chat service (thread management, real-time events, and more).
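For a feel of the layered SwiftUI structure, here is a minimal sketch. Only the name AppRootView comes from the description above; the tab layout, the BackendMode enum, and the placeholder feature views are invented for illustration.

```swift
import SwiftUI

enum BackendMode: String, CaseIterable { case mock, cloud, local }

/// A minimal sketch of the layered root view; the real AppRootView is richer.
struct AppRootView: View {
    @State private var backendMode: BackendMode = .mock

    var body: some View {
        TabView {
            ChatView(mode: backendMode)
                .tabItem { Label("Chat", systemImage: "bubble.left.and.bubble.right") }
            SettingsView(mode: $backendMode)
                .tabItem { Label("Settings", systemImage: "gearshape") }
        }
    }
}

/// Placeholder for the Features/Chat layer.
struct ChatView: View {
    let mode: BackendMode
    var body: some View { Text("Chat (\(mode.rawValue) backend)") }
}

/// Placeholder for the Features/Settings layer: lets the user pick a backend.
struct SettingsView: View {
    @Binding var mode: BackendMode
    var body: some View {
        Picker("Backend", selection: $mode) {
            ForEach(BackendMode.allCases, id: \.self) { Text($0.rawValue.capitalized) }
        }
        .pickerStyle(.segmented)
        .padding()
    }
}
```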

## Privacy and Ethics: On-Device-First Design Protects User Rights

Core functions run locally; the user keeps the final say over every reply; the three backend modes are transparently distinguished; and in cloud mode only the necessary context leaves the device. By positioning itself as an assistive tool that helps users express themselves better, the app sidesteps the ethical risk of AI replacing social interaction outright.
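As one way to read "only necessary context is sent to the cloud", here is a speculative Swift sketch of a context-trimming step. The Message type, the window size, and the me/them pseudonymization are all assumptions, not the app's documented behavior.

```swift
/// Speculative sketch: trim and pseudonymize history before a cloud call.
struct Message {
    let isFromMe: Bool
    let text: String
}

/// Keeps only the last few messages needed to infer tone and topic,
/// and replaces sender identities with "me"/"them" before anything is uploaded.
func cloudContext(from history: [Message], window: Int = 6) -> [String] {
    history.suffix(window).map { ($0.isFromMe ? "me: " : "them: ") + $0.text }
}
```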

## Conclusion and Recommendations: Responsible Practice of AI-Assisted Social Interaction

Social Draft shows how large-model technology can be turned into a product that solves a real user pain point. Its strengths are the on-device-first design, LoRA personalization, and a consistent emphasis on privacy. For users, it eases social anxiety; for developers, it is a worthwhile on-device AI project to study, with a complete application implementation and research workflow.
