# Commonsense-Driven Transformer Fine-Tuning: Enabling LLMs to Generate More Coherent Stories

> An NLP and generative AI system that uses LoRA technology to fine-tune three large language models, integrates commonsense reasoning capabilities for short story generation, and is trained on the ROCStories dataset with evaluation metrics including BLEU, ROUGE, BERTScore, and perplexity.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-12T22:41:36.000Z
- Last activity: 2026-06-12T22:57:24.879Z
- Popularity: 159.7
- Keywords: large language models, LoRA fine-tuning, commonsense reasoning, story generation, Transformer, generative AI, NLP, ROCStories
- Page link: https://www.zingnex.cn/en/forum/thread/transformer-ec228717
- Canonical: https://www.zingnex.cn/forum/thread/transformer-ec228717
- Markdown source: floors_fallback

---

## Introduction: Commonsense-Driven Transformer Fine-Tuning Improves Story Generation Coherence

**Original Author/Maintainer**: nithin-jella
**Source Platform**: GitHub
**Original Title**: Commonsense-Driven-Fine-Tuning-of-Transformer-Models-for-Coherent-Story-Generation
**Original Link**: https://github.com/nithin-jella/Commonsense-Driven-Fine-Tuning-of-Transformer-Models-for-Coherent-Story-Generation
**Publication Time**: 2026-06-12

This project targets the logical breaks and commonsense violations that appear when large language models generate longer stories. It fine-tunes three large language models of different architectures with LoRA, integrates commonsense reasoning, trains on the ROCStories dataset, and evaluates with BLEU, ROUGE, BERTScore, and perplexity. The goal is more coherent and plausible short stories.

## Research Background and Motivation

Large language models excel at text generation, but longer stories often expose logical breaks, implausible plots, and character behavior that violates commonsense. A likely root cause is that models mainly learn surface statistical patterns of text and lack deeper causal and commonsense knowledge. For example, a model might write "Xiao Ming put ice cubes into hot tea, and the ice cubes became larger," which contradicts basic physics. This project aims to reduce such errors and improve the coherence and plausibility of generated stories by injecting commonsense reasoning.

## Technical Solution: LoRA Fine-Tuning and Commonsense Integration

### Model Selection and Fine-Tuning Strategy
Three representative large language models with different architectures and scales are compared to test how well the method generalizes. LoRA (Low-Rank Adaptation) provides parameter-efficient fine-tuning: the pretrained weights stay frozen and only a small number of low-rank matrices are trained. This keeps compute and storage costs low, adds no inference latency once the adapter is merged into the base weights, and reduces the risk of catastrophic forgetting.
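To make the parameter saving concrete, here is a minimal sketch of the LoRA idea in plain Python. It is not the project's code: the layer size (1024 x 1024), the rank (8), and the scaling factor `alpha` are illustrative assumptions, since the source does not state the hyperparameters it used.

```python
import random

def lora_param_counts(d_out, d_in, rank):
    """Return (full fine-tuning params, LoRA trainable params) for one weight matrix."""
    full = d_out * d_in
    lora = rank * (d_out + d_in)  # B is d_out x rank, A is rank x d_in
    return full, lora

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha, rank):
    """Compute y = W x + (alpha / rank) * B (A x); W is frozen, only A and B train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]

# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(1024, 1024, 8)
print(full, lora)  # 1048576 16384

# Tiny demo: with B initialised to zeros, the adapted layer starts out
# identical to the frozen base layer.
random.seed(0)
d_out, d_in, r = 4, 3, 2
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]
x = [1.0, 2.0, 3.0]
assert lora_forward(W, A, B, x, alpha=16, rank=r) == matvec(W, x)
```

In this assumed configuration the adapter trains 16,384 values per matrix instead of 1,048,576, which is where the compute and storage savings come from.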

### Commonsense Reasoning Integration
- **Source of Commonsense Knowledge**: Use existing commonsense knowledge bases and reasoning datasets to provide prior knowledge such as physics, social norms, and causal relationships.
- **Training Data**: Build training samples based on the ROCStories dataset (five-sentence short stories, manually verified to conform to commonsense).
- **Loss Function**: On top of the standard language modeling loss, an auxiliary loss for commonsense consistency may be introduced to achieve multi-objective optimization.
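The multi-objective idea in the last bullet can be sketched as a weighted sum, total = LM loss + lambda x auxiliary loss. The source only says such a term "may be introduced", so everything about the auxiliary term here is an assumption: the plausibility score, its 0/1 label, the binary cross-entropy form, and the weight `lam` are hypothetical names chosen for illustration.

```python
import math

def lm_loss(token_probs):
    """Standard language modeling loss: mean negative log-likelihood
    of the reference tokens under the model."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def commonsense_loss(plausibility, label):
    """Hypothetical auxiliary term: binary cross-entropy between a predicted
    plausibility score in (0, 1) and a 0/1 commonsense label."""
    return -(label * math.log(plausibility)
             + (1 - label) * math.log(1 - plausibility))

def total_loss(token_probs, plausibility, label, lam=0.5):
    """Weighted sum of the two objectives; lam is an assumed trade-off weight."""
    return lm_loss(token_probs) + lam * commonsense_loss(plausibility, label)

# Probabilities the model assigned to each reference token (made-up values).
probs = [0.5, 0.25, 0.8]
print(total_loss(probs, plausibility=0.9, label=1))
```

Setting `lam=0` recovers plain language-model fine-tuning, which gives a natural baseline for checking whether the auxiliary term helps.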

## Evaluation System: Multi-Dimensional Measurement of Generation Quality

### Automatic Evaluation Metrics
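- **BLEU**: Measures n-gram overlap between generated and reference text; it is precision-oriented and reflects lexical similarity.
- **ROUGE**: Focuses on recall; ROUGE-L scores the longest common subsequence between the two texts, so it rewards words that appear in the same order.
- **BERTScore**: Computes semantic similarity from the contextual embeddings of a pre-trained model, and generally tracks human judgment more closely than n-gram metrics.
- **Perplexity**: The exponentiated average negative log-likelihood per token. Lower perplexity means the model assigns higher probability to the text, which is usually read as a proxy for fluency; it does not measure factual or commonsense correctness.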
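The quantities behind three of these metrics are simple enough to sketch with the standard library. This is a simplified illustration, not a replacement for an evaluation library: full BLEU also combines several n-gram orders with a brevity penalty, ROUGE-L is usually reported as an F-measure, and BERTScore is omitted because it needs a pre-trained model. The example sentences are made up.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(candidate, reference, n):
    """Modified n-gram precision, the core quantity behind BLEU."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[g]) for g, count in cand.items())
    return overlap / total

def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_recall(candidate, reference):
    """Recall form of ROUGE-L: LCS length over reference length."""
    return lcs_length(candidate, reference) / len(reference) if reference else 0.0

def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

reference = "the ice melted in the hot tea".split()
candidate = "the ice melted in hot tea".split()

print(clipped_precision(candidate, reference, 1))  # 1.0
print(clipped_precision(candidate, reference, 2))  # 0.8
print(lcs_length(candidate, reference))            # 6
print(perplexity([math.log(0.25)] * 4))
```

The last call shows the intuition for perplexity: a model that gives every token probability 0.25 behaves as if it were choosing uniformly among about four options at each step.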

### Commonsense Consistency Evaluation
- **Human Evaluation**: Human judges assess logical rationality and commonsense compliance.
- **Adversarial Testing**: Design test cases to check if commonsense-violating content is avoided.
- **Comparative Experiments**: Compare with baseline models without commonsense enhancement.

## Experimental Results and Key Findings

The source reports no specific numbers, so the points below are expectations inferred from the project description rather than measured results:
1. **Commonsense Enhancement**: The fine-tuned models are expected to keep their language fluency while producing more logically consistent stories.
2. **LoRA Applicability**: The project tests whether LoRA is effective for commonsense-oriented fine-tuning, which lowers the compute needed to run such experiments.
3. **Multi-Model Comparison**: Comparing three models shows how architecture and scale relate to commonsense reasoning ability, which can guide future optimization.

## Application Scenarios and Potential Value

- **Creative Writing Assistance**: Provide AI assistants for authors to generate logically reasonable story frameworks, plot twists, etc.
- **Educational Content Generation**: Automatically generate educational stories that conform to scientific commonsense, supporting large-scale production of personalized learning materials.
- **Dialogue System Enhancement**: Improve the long-text generation ability of chatbots and maintain logical consistency.
- **Game Narrative Design**: Generate dynamic plots for open-world games, ensuring consistency in NPC behaviors and physical rules.

## Limitations and Future Directions

- **Commonsense Coverage**: Current knowledge bases mainly cover physics and social norms; professional domain knowledge is limited, so breadth and depth need to be expanded.
- **Cultural Differences**: Commonsense is culturally relative, so adaptation to multilingual and multicultural scenarios is needed.
- **Computational Efficiency**: Inference remains resource-intensive; model compression, quantization, and similar techniques can make deployment more practical.
- **Evaluation Challenges**: There is a gap between automatic metrics and human judgment; better automatic evaluation methods for commonsense consistency need to be developed.
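For the computational efficiency point above, a rough back-of-the-envelope calculation shows why quantization helps. The 7-billion-parameter size is a hypothetical example (the source does not name model sizes), and the estimate covers weights only: it ignores activations, the key-value cache, and the small overhead of quantization scales.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

# Hypothetical 7-billion-parameter model at three precisions.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(7e9, bits):.1f} GB")
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts weight memory from about 14 GB to about 3.5 GB.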

## Summary and Insights

This project is a useful step toward natural language generation that respects real-world logic, not just surface fluency. Its technical route, parameter-efficient fine-tuning combined with commonsense injection, offers a feasible path for researchers with limited compute. For developers, the takeaway is that large-model applications benefit from optimization toward specific requirements such as commonsense consistency, and that a "general foundation + specialized enhancement" approach may become a mainstream paradigm. Multimodal and world-model techniques may further improve commonsense reasoning and, with it, AI story generation.