Zing Forum

Social Draft: On-Device AI Social Reply Assistant, Making Every Response Just Right

An iOS on-device-first social reply assistant app that helps users find natural, thoughtful responses in awkward moments through local large model inference, distilled SFT training, and user-controllable LoRA personalization.

Tags: On-Device AI · Social Assistant · LoRA · Local LLM · iOS App · Privacy · SwiftUI · llama.cpp · AI Reply Suggestions · Personalized Models
Published 2026-04-26 15:15 · Last activity 2026-04-26 15:21 · Estimated read: 6 min

Section 01

Introduction: Social Draft, an On-Device AI Social Reply Assistant That Makes Every Response Just Right

Social Draft is an on-device-first iOS social reply assistant app. Its core goal is to relieve the awkward anxiety users feel when they don't know how to respond to a message. It positions itself as a "reply advisor" rather than a chatbot: the final decision on what to send always stays in the user's hands. Through local LLM inference, distilled SFT training, and user-controllable LoRA personalization, it offers natural, thoughtful reply suggestions, while its on-device-first design protects privacy: in local mode, data never leaves the device.

Section 02

Background: Pain Points of Social Replies and Product Positioning

The product targets common awkward moments in real conversations: declining an invitation when you're tired, politely turning down an unwanted event, responding appropriately in emotionally charged exchanges, keeping work or school replies concise and professional, and simply struggling to put a feeling into words. Unlike AI products on the market that "speak for users", Social Draft chooses to "help users speak", leaving the final say over every reply with the user.

Section 03

Methodology: On-Device-First Technical Architecture and Personalization Support

The app offers three interchangeable backends: Mock (the default safe mode, requiring no model or API key), Cloud (calling APIs such as OpenAI, Anthropic, or Gemini), and Local (an on-device GGUF model, privacy-first). Local mode is the core: it needs no network, and all data stays on the device. Users can also train LoRA adapters locally for style personalization, privacy protection, and lightweight adaptation. It currently supports the Llama-3.2-1B/3B-Instruct-Q4_K_M models and the reply_sft_lora_v1 adapter.
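The three-backend split can be sketched as a simple dispatcher. This is a minimal Python sketch, not the app's Swift code; the function names and the Mock-mode canned reply are illustrative assumptions.

```python
from enum import Enum, auto

class Backend(Enum):
    MOCK = auto()   # default safe mode: canned suggestions, no model or API key
    CLOUD = auto()  # OpenAI / Anthropic / Gemini over the network
    LOCAL = auto()  # on-device GGUF model, nothing leaves the device

def local_generate(message: str) -> str:
    # Placeholder for on-device llama.cpp inference against a GGUF model.
    return f"[local draft for: {message[:30]}]"

def suggest_reply(message: str, backend: Backend) -> str:
    if backend is Backend.MOCK:
        return "Thanks for the invite! I'm wiped tonight - rain check?"
    if backend is Backend.CLOUD:
        raise NotImplementedError("would call the configured cloud provider here")
    return local_generate(message)
```

The point of the Mock default is that the app is fully navigable with no model downloaded and no key configured; Local and Cloud are opt-in upgrades.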

Section 04

Evidence: Core Features Enable Natural and Appropriate Responses

Core features include smart suggestion cards in multiple styles (natural, direct, friendly, thoughtful, decision-oriented); Ghost Text inline completion, which runs locally to speed up drafting; context awareness, which reads recent messages to understand tone and topic; and reply targeting, which generates suggestions for a specific message.
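Style cards, context awareness, and reply targeting all meet in the prompt that gets sent to the model. A minimal Python sketch, assuming a plain-text prompt format and a five-turn context window (both assumptions, not the app's actual template):

```python
STYLES = ("natural", "direct", "friendly", "thoughtful", "decision-oriented")

def build_prompt(history, target, style):
    """history: list of (speaker, text) tuples; target: the message being
    answered; style: which suggestion card is being generated."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style}")
    # Context awareness: only the most recent turns inform tone and topic.
    context = "\n".join(f"{who}: {text}" for who, text in history[-5:])
    return (
        f"Recent conversation:\n{context}\n\n"
        f'Draft a {style} reply to: "{target}"\n'
        "Keep it short and in the user's own voice."
    )
```

Generating one prompt per style yields the row of suggestion cards; Ghost Text would instead append the user's partial draft and ask the model to continue it.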

Section 05

Research Support: Complete Workflow from Data to Model

The repository includes a two-stage Claude distillation pipeline for generating synthetic social-reply datasets, complete LoRA/SFT training notebooks so users can train personalized models on their own dialogue data, and tools in the Experiments_Benchmarks directory for comparing reply quality between local and cloud models.
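The bridge between the distillation stage and the SFT notebooks is a formatting step that turns each synthetic (context, target, style, reply) example into a chat-format training record. A sketch under the assumption that the data uses the common `messages` schema; the actual schema of reply_sft_lora_v1's training set may differ:

```python
def to_sft_record(context, target, style, reply):
    """context: list of 'Speaker: text' strings from the synthetic dialogue;
    reply: the distilled teacher (Claude) reply used as the training target."""
    prompt = (
        "Recent conversation:\n"
        + "\n".join(context)
        + f"\n\nSuggest a {style} reply to: {target}"
    )
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": reply},
        ]
    }
```

Records in this shape can be fed straight into standard SFT tooling, with a LoRA config so only the adapter weights are trained.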

Section 06

Technical Implementation: iOS Architecture and Multi-Backend Support

The app is built with SwiftUI in a layered architecture (AppRootView, Features/Chat/Settings, etc.). Local inference goes through llama.xcframework: loading GGUF models, mounting LoRA adapters, and tuning the sampling chain. Cloud mode supports OpenAI, Anthropic, Groq, and others, and Supabase integration provides data persistence for the demo chat service (thread management, real-time events, etc.).
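The GGUF-plus-LoRA loading that llama.xcframework does on-device has a desktop analogue in llama-cpp-python, which is handy for reproducing the app's setup in the training notebooks. A sketch; the adapter filename is an assumption, and the import is deferred so the helper is usable without the library installed:

```python
def llama_init_kwargs(model_path, lora_path=None, n_ctx=2048):
    # Arguments mirror llama-cpp-python's Llama(...) constructor.
    kwargs = {"model_path": str(model_path), "n_ctx": n_ctx}
    if lora_path is not None:
        kwargs["lora_path"] = str(lora_path)  # mount the LoRA adapter at load time
    return kwargs

def load_local_model(model_path, lora_path=None):
    # Deferred import: pip install llama-cpp-python
    from llama_cpp import Llama
    return Llama(**llama_init_kwargs(model_path, lora_path))
```

On iOS the equivalent calls go through the llama.cpp C API exposed by llama.xcframework, but the shape of the configuration (model path, context size, optional adapter) is the same.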

Section 07

Privacy and Ethics: On-Device-First Design Protects User Rights

Core functions run locally; users keep control over what gets sent; the three modes are transparently distinguished; and only the necessary context is sent to the cloud. By positioning itself as an auxiliary tool that helps users express themselves better, it avoids the ethical risk of AI replacing genuine social interaction.
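The "only necessary context" rule amounts to a trimming step before any cloud request. A minimal sketch; the field names and the three-turn budget are assumptions, not the app's actual schema:

```python
def minimal_cloud_context(history, max_turns=3):
    # Keep only the most recent turns, and strip everything except role and
    # text (dropping sender IDs, timestamps, etc.) before the request leaves
    # the device. Local mode skips this entirely: nothing is sent anywhere.
    return [{"role": m["role"], "text": m["text"]} for m in history[-max_turns:]]
```

This keeps the cloud provider from ever seeing the full conversation history or any per-message metadata, which is the practical meaning of "only necessary context".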

Section 08

Conclusion and Recommendations: Responsible Practice of AI-Assisted Social Interaction

Social Draft shows how large-model technology can be turned into a product that solves a real user pain point. Its strengths are its on-device-first design, LoRA personalization, and emphasis on privacy. For users, it eases social anxiety; for developers, it is a worthwhile on-device AI project to study, with a complete application implementation and research workflow.