# KoRe: Injecting Interpretable External Knowledge into Large Language Models Using Compact Discrete Knowledge Tokens

> To address the inherent weaknesses of parametric knowledge storage in large language models, researchers propose KoRe, a method that encodes 1-hop subgraphs from knowledge graphs into compact discrete knowledge tokens and injects them into the model. It achieves competitive performance on three benchmarks while using up to 10x fewer tokens.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-19T17:53:29.000Z
- Last activity: 2026-05-20T02:54:31.153Z
- Popularity: 140.0
- Keywords: knowledge graphs, large language models, knowledge enhancement, knowledge representation, inference optimization, explainable AI, RAG
- Page link: https://www.zingnex.cn/en/forum/thread/kore-a33b4145
- Canonical: https://www.zingnex.cn/forum/thread/kore-a33b4145
- Markdown source: floors_fallback

---

## KoRe Method Guide: Enhancing LLM Knowledge Capabilities with Compact Discrete Knowledge Tokens

Large language models (LLMs) store knowledge in their parameters, which has inherent weaknesses: the knowledge is encoded implicitly, is hard to interpret and debug, is costly to update, and is prone to hallucination. To address this, researchers propose KoRe, which encodes 1-hop subgraphs from knowledge graphs into compact discrete knowledge tokens and injects them into the model's input sequence. The method requires no model training, works plug-and-play, achieves competitive performance on three benchmarks, and uses up to 10x fewer tokens.

## Dilemmas of LLM Knowledge Storage and the Alternative Value of Knowledge Graphs

### Dilemmas of Parameterized Knowledge Storage in LLMs
1. **Lack of Interpretability**: Knowledge is scattered across parameters as distributed representations, so a fact cannot be traced directly to its source;
2. **High Update Costs**: Adding new knowledge requires expensive fine-tuning or retraining;
3. **Hallucination Problem**: The model can generate incorrect content from spurious correlations, with no signal that it is wrong.

### Alternative Solution with Knowledge Graphs
Knowledge graphs (KGs) store knowledge as explicit triples (e.g., "Einstein - Awarded - Nobel Prize"), which makes it readable, easy to verify, and editable. However, existing methods for combining KGs with LLMs generally require extensive retraining or fine-tuning, which limits their practicality.
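To make the contrast with parametric storage concrete, here is a minimal sketch of triples as a data structure. The `Triple` type and the helpers `has_fact` and `update_fact` are hypothetical names used only for illustration; they are not part of KoRe or of any particular KG library.

```python
from typing import NamedTuple, Set


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str


# A toy knowledge graph: a set of explicit (head, relation, tail) facts.
kg: Set[Triple] = {
    Triple("Einstein", "awarded", "Nobel Prize"),
    Triple("Einstein", "born_in", "Ulm"),
}


def has_fact(kg: Set[Triple], head: str, relation: str, tail: str) -> bool:
    """Verification is a plain membership test."""
    return Triple(head, relation, tail) in kg


def update_fact(kg: Set[Triple], head: str, relation: str, new_tail: str) -> Set[Triple]:
    """Return a new graph where (head, relation, *) is replaced by one new tail."""
    kept = {t for t in kg if not (t.head == head and t.relation == relation)}
    kept.add(Triple(head, relation, new_tail))
    return kept


updated = update_fact(kg, "Einstein", "awarded", "Nobel Prize in Physics")
print(has_fact(updated, "Einstein", "awarded", "Nobel Prize in Physics"))
```

Editing a fact here is a set operation on explicit data, whereas the same change in a purely parametric model would need fine-tuning or retraining.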

## Core Methods and Knowledge Injection Mechanism of KoRe

### Core Idea
Encode 1-hop subgraphs from KGs into compact discrete tokens and inject them into the LLM's input, so that knowledge can be added plug-and-play without any training.

### Reasons for Choosing 1-hop Subgraphs
- Moderate information density: Contains key facts without excessive noise;
- Simple and regular structure: Facilitates standardized encoding;
- High retrieval efficiency: Suitable for online scenarios.
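As an illustration of why 1-hop retrieval is cheap, the sketch below indexes triples by entity so that a lookup is a single dictionary access. It assumes "1-hop" means every triple in which the entity appears as head or tail; the source does not spell out the exact definition, and `build_index` and `one_hop` are hypothetical helper names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def build_index(triples: List[Triple]) -> Dict[str, List[Triple]]:
    """Map each entity to the triples that touch it directly."""
    index: Dict[str, List[Triple]] = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((head, relation, tail))
        if tail != head:  # avoid listing a self-loop twice
            index[tail].append((head, relation, tail))
    return index


def one_hop(index: Dict[str, List[Triple]], entity: str) -> List[Triple]:
    """Return the 1-hop subgraph of an entity, or an empty list if unknown."""
    return list(index.get(entity, []))


triples = [
    ("Einstein", "awarded", "Nobel_Prize_in_Physics"),
    ("Einstein", "born_in", "Ulm"),
    ("Ulm", "located_in", "Germany"),
]
index = build_index(triples)
print(one_hop(index, "Einstein"))
```

The triple `("Ulm", "located_in", "Germany")` is two hops away from `Einstein`, so it is not returned for that entity.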

### Discrete Knowledge Token Design
1. Entity encoding: Map entities to dedicated tokens (e.g., `<ENT_Einstein>`);
2. Relation encoding: Map relations to relation tokens (e.g., `<REL_awarded>`);
3. Subgraph serialization: Convert 1-hop subgraphs into linear sequences (e.g., `<ENT_Einstein> <REL_awarded> <ENT_Nobel_Prize_in_Physics>`);
4. Compact representation: Merge triples via templates and compression rules.
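The four steps above can be sketched as follows. The `<ENT_…>` and `<REL_…>` formats come from the examples in the list; everything else is an assumption. In particular, the source does not specify the templates or compression rules, so `serialize_compact` uses one plausible rule (emit each head entity once and append its relation/tail pairs) purely for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def ent(name: str) -> str:
    """Entity encoding: map an entity name to a dedicated token."""
    return f"<ENT_{name.replace(' ', '_')}>"


def rel(name: str) -> str:
    """Relation encoding: map a relation name to a relation token."""
    return f"<REL_{name.replace(' ', '_')}>"


def serialize(triples: List[Triple]) -> str:
    """Subgraph serialization: one head/relation/tail token group per fact."""
    return " ".join(f"{ent(h)} {rel(r)} {ent(t)}" for h, r, t in triples)


def serialize_compact(triples: List[Triple]) -> str:
    """Assumed compression rule: emit each head once, then its (relation, tail) pairs."""
    grouped: Dict[str, List[Tuple[str, str]]] = {}
    for h, r, t in triples:
        grouped.setdefault(h, []).append((r, t))
    parts: List[str] = []
    for h, pairs in grouped.items():
        parts.append(ent(h))
        for r, t in pairs:
            parts.extend([rel(r), ent(t)])
    return " ".join(parts)


subgraph = [
    ("Einstein", "awarded", "Nobel Prize in Physics"),
    ("Einstein", "born_in", "Ulm"),
]
print(serialize(subgraph))
print(serialize_compact(subgraph))
```

With this rule, the two-fact subgraph shrinks from six tokens to five because `<ENT_Einstein>` is emitted only once; the saving grows with the number of facts that share a head entity.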

### Knowledge Injection Mechanism
KoRe adopts a prefix injection strategy: the encoded knowledge tokens are placed before the user query. At query time the pipeline runs in four steps: entity recognition → subgraph retrieval → token encoding → prefix injection.
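A minimal end-to-end sketch of this pipeline is shown below. Every component is a simplified stand-in: entity recognition is naive substring matching, the subgraph is restricted to triples whose head is the recognized entity, and the toy `KG` list and all function names are hypothetical. How the knowledge tokens map onto a real model's vocabulary is not covered here.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

# Toy knowledge graph used only for this example.
KG: List[Triple] = [
    ("Einstein", "awarded", "Nobel_Prize_in_Physics"),
    ("Einstein", "born_in", "Ulm"),
    ("Curie", "awarded", "Nobel_Prize_in_Chemistry"),
]


def recognize_entities(query: str, kg: List[Triple]) -> List[str]:
    """Step 1: naive entity recognition by case-insensitive substring match."""
    heads = {h for h, _, _ in kg}
    return sorted(h for h in heads if h.lower() in query.lower())


def retrieve_subgraph(entities: List[str], kg: List[Triple]) -> List[Triple]:
    """Step 2: keep triples whose head is a recognized entity (simplified 1-hop)."""
    return [t for t in kg if t[0] in entities]


def encode(triples: List[Triple]) -> str:
    """Step 3: turn triples into a flat sequence of knowledge tokens."""
    return " ".join(f"<ENT_{h}> <REL_{r}> <ENT_{t}>" for h, r, t in triples)


def build_prompt(query: str, kg: List[Triple] = KG) -> str:
    """Step 4: place the encoded knowledge before the user query."""
    prefix = encode(retrieve_subgraph(recognize_entities(query, kg), kg))
    return f"{prefix}\n{query}" if prefix else query


print(build_prompt("Where was Einstein born?"))
```

When no entity is recognized, the query is passed through unchanged, so the model simply falls back to its own parametric knowledge.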

## Performance Evaluation and Efficiency Advantages of KoRe

### Accuracy Performance
On knowledge question-answering tasks, LLMs equipped with KoRe achieve performance competitive with models fine-tuned specifically for those tasks. The gain comes from how knowledge is accessed: the model directly "reads" the injected facts instead of having to "recall" them from its parameters.

### Token Efficiency Improvement
KoRe uses up to 10x fewer tokens, for three reasons:
- Structured compression: KG representations are more compact than natural language;
- Redundancy removal: Redundant textual information is eliminated;
- Precise injection: Only inject relevant subgraphs.
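The toy comparison below illustrates where the savings come from; it is not a measurement of the method. It assumes a crude whitespace "tokenizer" and counts each knowledge token as a single token, whereas a real subword tokenizer would give different numbers. The passage and the function name `rough_token_count` are made up for this example.

```python
def rough_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated piece."""
    return len(text.split())


# A natural-language passage, as a text-retrieval system might return it.
passage = (
    "Albert Einstein was awarded the Nobel Prize in Physics in 1921 "
    "for his services to theoretical physics."
)

# The same core fact expressed as discrete knowledge tokens.
knowledge_tokens = "<ENT_Einstein> <REL_awarded> <ENT_Nobel_Prize_in_Physics>"

text_cost = rough_token_count(passage)
kg_cost = rough_token_count(knowledge_tokens)
print(f"text: {text_cost}, knowledge tokens: {kg_cost}, ratio: {text_cost / kg_cost:.1f}x")
```

The passage also carries details the triple drops (the year and the citation), which is the trade-off behind "redundancy removal": only what the KG encodes survives.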

### Comparison with RAG
| Dimension | RAG | KoRe |
|------|-----|------|
| Knowledge Source | Unstructured documents | Structured knowledge graphs |
| Representation Form | Original text fragments | Discrete knowledge tokens |
| Interpretability | Medium (requires reading text) | High (structured triples) |
| Token Efficiency | Low (retains original text) | High (compact encoding) |
| Update Flexibility | Requires reindexing documents | Edit the KG directly |

## Application Scenarios and Current Limitations of KoRe

### Application Scenarios
1. **Domain knowledge enhancement**: Inject specialized domain knowledge (e.g., medical, legal) into general-purpose LLMs without retraining;
2. **Dynamic fact updates**: Update what the model knows in real time by updating the KG (e.g., news, sports results);
3. **Interpretable question-answering**: Answers can be traced back to specific triples in the KG.
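The second and third scenarios can be sketched together: a fact is changed in the KG, and the next prompt prefix reflects it, along with the triples it came from. `KnowledgeStore` is a hypothetical class that assumes one tail per (head, relation) pair, which is a simplification; real KGs often allow several.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


class KnowledgeStore:
    """Toy KG keyed by (head, relation); assumes a single tail per key."""

    def __init__(self) -> None:
        self._facts: Dict[Tuple[str, str], str] = {}

    def upsert(self, head: str, relation: str, tail: str) -> None:
        """Insert a fact or overwrite the existing one; no model is touched."""
        self._facts[(head, relation)] = tail

    def prefix_for(self, head: str) -> Tuple[str, List[Triple]]:
        """Return the knowledge-token prefix and the triples it was built from."""
        triples = [(h, r, t) for (h, r), t in self._facts.items() if h == head]
        tokens = " ".join(f"<ENT_{h}> <REL_{r}> <ENT_{t}>" for h, r, t in triples)
        return tokens, triples


store = KnowledgeStore()
store.upsert("Team_A", "latest_result", "Win")
before, _ = store.prefix_for("Team_A")

store.upsert("Team_A", "latest_result", "Loss")  # the fact changes in the KG
after, provenance = store.prefix_for("Team_A")
print(before)
print(after)
print(provenance)
```

Returning the source triples next to the prefix is what lets an answer be traced: the triples that were injected are the evidence the model was given.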

### Limitations
1. **Coverage limitations**: 1-hop subgraphs cannot support multi-hop reasoning problems;
2. **Token design overhead**: Predefined vocabulary and encoding rules are required, posing challenges for managing ultra-large-scale KGs;
3. **Coupling with model capabilities**: Relies on LLMs' in-context learning ability.

## Future Directions of KoRe and Insights into Knowledge Representation

### Future Research Directions
1. Multi-hop subgraph encoding: Support complex reasoning;
2. Adaptive token learning: Reduce manual design overhead;
3. Hybrid knowledge fusion: Combine KoRe (structured) with RAG (unstructured);
4. Incremental update mechanism: Efficient KG index structures.

### Insights into Knowledge Representation Paradigms
1. **Trade-off between parameterization and explicitness**: General language capabilities stay in the model's parameters, while specific facts are stored explicitly;
2. **Neuro-symbolic hybrid architecture**: Neural networks handle language understanding and generation, KGs handle knowledge storage and reasoning;
3. **Modular knowledge services**: Knowledge as an independent service dynamically injected into downstream models.

## Value and Future Outlook of KoRe

KoRe offers a lightweight path to LLM knowledge enhancement: with a carefully designed representation and a simple injection step, it avoids expensive retraining. As AI systems grow more complex and knowledge needs more diverse, a flexible, efficient, and interpretable solution of this kind becomes increasingly important. As KG technology matures and LLM applications expand, KoRe could become a standard bridge between structured knowledge and neural language models.
