# Writing Finetuner: Fine-tuning Mistral 7B on Personal Writing Corpus to Verify if the Model Truly Learns Your Way of Thinking

> A complete project for fine-tuning on personal writing style. Beyond tracking loss reduction, it uses a reasoning probe to evaluate whether the model has learned your way of thinking rather than just surface patterns.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-06-02T10:44:02.000Z
- Last activity: 2026-06-02T10:49:31.886Z
- Popularity: 141.9
- Keywords: LLM fine-tuning, LoRA, Mistral 7B, personalized AI, reasoning evaluation, corpus preparation, parameter-efficient fine-tuning, writing style transfer
- Page link: https://www.zingnex.cn/en/forum/thread/writing-finetuner-mistral-7b
- Canonical: https://www.zingnex.cn/forum/thread/writing-finetuner-mistral-7b

---

## Writing Finetuner: Using Reasoning Probes to Verify Whether Mistral 7B Learns Your Way of Thinking

Writing Finetuner is an open-source project for fine-tuning Mistral 7B on a personal writing corpus. It targets a gap in conventional fine-tuning evaluation, which looks only at surface metrics such as loss and perplexity. The project adds a reasoning probe to check whether the model has learned the user's way of thinking rather than just imitating habitual wording. It is maintained by satyam671 and was released on GitHub on June 2, 2026.

## Project Background: Limitations of Traditional Fine-tuning Evaluation

The project grew out of a six-month personal experiment by the author: the fine-tuned model could imitate the author's writing style, but when given new scenarios its reasoning differed significantly from how the author actually thinks. This shows that standard metrics such as perplexity and ROUGE-L measure only surface text similarity and do not capture deeper reasoning patterns. The project therefore aims to provide a more comprehensive evaluation method for deciding whether the model has learned the user's thinking.
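To make the "surface similarity" point concrete, here is a minimal sketch of ROUGE-L as an F1 score over the longest common subsequence (LCS) of tokens. It assumes plain whitespace tokenization and lowercase matching; the function names are illustrative and are not taken from the project, which may use a library implementation with different tokenization.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Simplified ROUGE-L F1 (whitespace tokens, case-insensitive)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Same vocabulary, opposite diagnostic order.
reference = "check the data first then check the model"
candidate = "check the model first then check the data"
score = rouge_l_f1(reference, candidate)
print(f"ROUGE-L F1: {score:.2f}")
```

The two sentences recommend opposite orders of investigation, yet they share six of eight tokens in sequence, so the score stays high. This is the kind of mismatch between text overlap and reasoning that the project's probe is meant to surface.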

## Project Methodology: End-to-End Fine-tuning and Evaluation Process

The project provides an end-to-end process with five stages:

1. Corpus preparation: supports Medium HTML, txt, and md input, with automatic cleaning and segmentation.
2. LoRA fine-tuning: parameter-efficient training that leaves the base model weights unchanged.
3. Standard evaluation: computes perplexity and ROUGE-L.
4. Reasoning probe: the core innovation, which evaluates reasoning in new scenarios.
5. Deep scoring: structural similarity, reasoning depth, and viewpoint consistency.

Hardware requirements: LoRA in bf16 requires 24 GB of VRAM (e.g., an RTX 3090), fp16 requires 20 GB, and enabling gradient checkpointing brings this down to about 16 GB. Three epochs of training on a 400,000-word corpus take 18-20 hours.
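The "parameter-efficient" claim for the LoRA stage can be checked with a short calculation. The sketch below counts trainable adapter parameters under assumptions that the source does not state: rank 16, adapters on the `q_proj` and `v_proj` attention projections only, and the published Mistral 7B dimensions (hidden size 4096, key/value projection width 1024, 32 layers). The project's actual `train_config.yaml` may choose a different rank or different target modules.

```python
def lora_param_count(in_features, out_features, rank):
    """Trainable parameters in one LoRA adapter.

    LoRA adds two low-rank matrices next to a frozen weight:
    A with shape (rank, in_features) and B with shape (out_features, rank).
    """
    return rank * (in_features + out_features)


# Mistral 7B architecture dimensions.
HIDDEN = 4096      # model hidden size
KV_DIM = 1024      # 8 key/value heads * head dimension 128
NUM_LAYERS = 32

# Assumptions for illustration only, not read from the project config.
RANK = 16
BASE_PARAMS = 7.24e9  # approximate total parameter count of the base model

per_layer = (
    lora_param_count(HIDDEN, HIDDEN, RANK)    # q_proj adapter
    + lora_param_count(HIDDEN, KV_DIM, RANK)  # v_proj adapter
)
total = per_layer * NUM_LAYERS

print(f"trainable LoRA parameters: {total:,}")
print(f"share of base model: {total / BASE_PARAMS:.4%}")
```

Under these assumptions fewer than 0.1% of the weights are trained, which is why the base model can stay frozen and the run fits on a single consumer GPU.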

## Core Innovation: Reasoning Probe Evaluates the Model's Real Thinking

The reasoning probe is the project's distinguishing feature. Its design philosophy is that "true understanding is reflected in reasoning ability in new scenarios". The process: 1. Present a previously unseen question to both the user and the model; 2. The user writes an answer; 3. The model generates an answer; 4. The user evaluates both answers blind, to avoid confirmation bias. Example question: "Diagnostic process for ML models that perform well in testing but degrade in production". Scoring dimensions: structural similarity, reasoning depth, and viewpoint consistency.
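The blind-evaluation step can be sketched as follows. This is a minimal illustration of the protocol described above, not the project's code: `BlindTrial`, `make_blind_trial`, and `unblind` are hypothetical names, and the 1-5 score range in the usage example is an assumption. The idea is to shuffle the two answers behind neutral labels, collect scores per dimension, and only then reveal which answer came from whom.

```python
import random
from dataclasses import dataclass

# The three scoring dimensions named in the text.
DIMENSIONS = ("structural_similarity", "reasoning_depth", "viewpoint_consistency")


@dataclass
class BlindTrial:
    question: str
    answers: dict  # neutral label -> answer text, shown to the rater
    key: dict      # neutral label -> "user" or "model", hidden while rating


def make_blind_trial(question, user_answer, model_answer, rng):
    """Shuffle the two answers and hide their origin behind labels A and B."""
    pairs = [("user", user_answer), ("model", model_answer)]
    rng.shuffle(pairs)
    labels = ("A", "B")
    answers = {label: text for label, (_, text) in zip(labels, pairs)}
    key = {label: source for label, (source, _) in zip(labels, pairs)}
    return BlindTrial(question, answers, key)


def unblind(trial, scores):
    """Map per-label scores back to their true source after rating."""
    return {trial.key[label]: dims for label, dims in scores.items()}


trial = make_blind_trial(
    "Diagnostic process for ML models that perform well in testing "
    "but degrade in production",
    user_answer="(answer written by the user)",
    model_answer="(answer generated by the model)",
    rng=random.Random(),
)

# The rater sees only trial.answers and scores each label on every dimension.
scores = {
    "A": {dim: 4 for dim in DIMENSIONS},
    "B": {dim: 2 for dim in DIMENSIONS},
}
result = unblind(trial, scores)
print(result["user"], result["model"])
```

Keeping the label-to-source key separate from the rated text is what removes confirmation bias: the rater cannot favor an answer for being their own.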

## Practical Guide and Training Result Examples

Quick start steps: 1. Clone the repository and install dependencies; 2. Place the corpus in data/raw and run the cleaning and chunking script; 3. Configure train_config.yaml; 4. Train; 5. Run the reasoning probe. Typical training results after three epochs: train_loss decreased from 2.143 to 1.203, eval_loss from 2.198 to 1.289, perplexity from 42.3 to 12.1, and ROUGE-L rose from 0.38 to 0.61. Even with these gains, the reasoning probe may reveal that the model has not learned the user's way of thinking.
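When reading these numbers, it helps to recall how loss and perplexity are commonly related. The sketch below assumes the usual definition, perplexity = exp(mean per-token cross-entropy in nats); the function name is illustrative and the source does not say how the project computes its perplexity.

```python
import math


def perplexity_from_loss(mean_nll):
    """Perplexity under the common definition exp(mean per-token NLL in nats)."""
    return math.exp(mean_nll)


# Figures quoted in the text: (before training, after training).
reported_eval_loss = (2.198, 1.289)
reported_perplexity = (42.3, 12.1)

for loss, ppl in zip(reported_eval_loss, reported_perplexity):
    implied = perplexity_from_loss(loss)
    print(f"eval_loss={loss:.3f}  exp(loss)={implied:.2f}  reported perplexity={ppl}")
```

Under that definition the reported eval_loss values correspond to perplexities of about 9.0 and 3.6, not 42.3 and 12.1. The reported perplexity is therefore presumably computed on a different basis, for example per word instead of per token, or on a different evaluation set. That explanation is a guess; the source does not clarify it.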

## Conclusion and Insights: Personalized AI Needs to Focus on Deep Thinking

Project insights: fine-tuning evaluation cannot rely solely on automated metrics, and true personalization requires the model to capture the user's thinking rather than copy surface patterns. In summary, Writing Finetuner provides a complete toolchain and methodology that serves as a useful reference for anyone fine-tuning a large model on personal data. It is an open-source project that pairs a working implementation with a more careful view of what evaluation should measure.
