
KoRe: Injecting Interpretable External Knowledge into Large Language Models Using Compact Discrete Knowledge Tokens

To address the inherent flaws of parameterized knowledge storage in large language models, researchers propose the KoRe method, which encodes 1-hop subgraphs from knowledge graphs into compact discrete knowledge tokens and injects them into the model. It achieves competitive performance on three benchmarks while using up to 10x fewer tokens.

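Knowledge graphs · Large language models · Knowledge enhancement · Knowledge representation · Inference optimization · Interpretable AI · RAG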

Published 2026-05-20 01:53 · Recent activity 2026-05-20 10:54 · Estimated read 9 min

Section 01

KoRe Method Guide: Enhancing LLM Knowledge Capabilities with Compact Discrete Knowledge Tokens

Parameterized knowledge storage in large language models (LLMs) has inherent flaws: knowledge is encoded implicitly, is hard to interpret and debug, is costly to update, and is prone to hallucination. To address these, researchers propose the KoRe method, which encodes 1-hop subgraphs from knowledge graphs into compact discrete knowledge tokens and injects them into the model's input sequence. The method requires no model training, is plug-and-play, achieves competitive performance on three benchmarks, and uses up to 10x fewer tokens.

Section 02

Dilemmas of LLM Knowledge Storage and the Alternative Value of Knowledge Graphs

Dilemmas of Parameterized Knowledge Storage in LLMs

Lack of Interpretability: Knowledge is scattered in parameters as distributed representations, making it impossible to directly trace sources;
High Update Costs: New knowledge requires expensive fine-tuning or retraining;
Hallucination Problem: The model can generate incorrect content from spurious correlations, with no awareness that it is wrong.

Alternative Solution with Knowledge Graphs

Knowledge graphs (KGs) store knowledge as explicit triples (e.g., "Einstein - Awarded - Nobel Prize"), which makes the knowledge readable, easy to verify, and editable. However, existing methods that combine KGs with LLMs generally require extensive retraining or fine-tuning, which limits their practicality.

Section 03

Core Methods and Knowledge Injection Mechanism of KoRe

Core Idea

Encode 1-hop subgraphs from KGs into compact discrete tokens and inject them into LLM inputs, enabling plug-and-play without training.

Reasons for Choosing 1-hop Subgraphs

Moderate information density: Contains key facts without excessive noise;
Simple and regular structure: Facilitates standardized encoding;
High retrieval efficiency: Suitable for online scenarios.

Discrete Knowledge Token Design

Entity encoding: Map entities to dedicated tokens (e.g., <ENT_Einstein>);
Relation encoding: Map relations to relation tokens (e.g., <REL_awarded>);
Subgraph serialization: Convert 1-hop subgraphs into linear sequences (e.g., <ENT_Einstein> <REL_awarded> <ENT_Nobel_Prize_in_Physics>);
Compact representation: Merge triples via templates and compression rules.
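
The encoding steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not KoRe's actual implementation: the function names are hypothetical, and grouping triples by head entity is just one plausible compression rule, since the source does not specify the templates it uses.

```python
from collections import defaultdict

Triple = tuple[str, str, str]  # (head, relation, tail)


def entity_token(name: str) -> str:
    """Map an entity name to a dedicated token such as <ENT_Einstein>."""
    return "<ENT_" + name.strip().replace(" ", "_") + ">"


def relation_token(name: str) -> str:
    """Map a relation name to a relation token such as <REL_awarded>."""
    return "<REL_" + name.strip().lower().replace(" ", "_") + ">"


def serialize_subgraph(triples: list[Triple]) -> str:
    """Linearize a 1-hop subgraph, emitting each head entity only once.

    Grouping by head is a hypothetical compression rule used for illustration.
    """
    by_head: dict[str, list[tuple[str, str]]] = defaultdict(list)
    for head, relation, tail in triples:
        by_head[head].append((relation, tail))

    parts: list[str] = []
    for head, edges in by_head.items():
        parts.append(entity_token(head))
        for relation, tail in edges:
            parts.append(relation_token(relation))
            parts.append(entity_token(tail))
    return " ".join(parts)


subgraph = [
    ("Einstein", "awarded", "Nobel Prize in Physics"),
    ("Einstein", "born in", "Ulm"),
]
print(serialize_subgraph(subgraph))
# <ENT_Einstein> <REL_awarded> <ENT_Nobel_Prize_in_Physics> <REL_born_in> <ENT_Ulm>
```

With a single triple, the output matches the serialization example given above. With several triples sharing a head, the head token is written once instead of once per triple.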

Knowledge Injection Mechanism

KoRe adopts a prefix injection strategy, placing the encoded knowledge tokens before the user query. At inference time the process runs in four steps: entity recognition → subgraph retrieval → token encoding → prefix injection.
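
The four-step flow can be sketched end to end as follows. Everything here is an assumption made for illustration: the toy `KG` dictionary, the substring-based entity recognizer, and the newline separator between prefix and query are stand-ins, and a real system would likely use a proper entity linker and KG store.

```python
# Toy knowledge graph: head entity -> list of (relation, tail) edges.
# All names here are illustrative assumptions, not KoRe's actual components.
KG = {
    "Einstein": [("awarded", "Nobel Prize in Physics"), ("born in", "Ulm")],
    "Ulm": [("located in", "Germany")],
}


def recognize_entities(query: str, kg: dict) -> list[str]:
    """Step 1: naive entity recognition via case-insensitive substring match."""
    lowered = query.lower()
    return [entity for entity in kg if entity.lower() in lowered]


def retrieve_subgraph(entity: str, kg: dict) -> list[tuple[str, str, str]]:
    """Step 2: return the entity's 1-hop neighborhood as triples."""
    return [(entity, rel, tail) for rel, tail in kg.get(entity, [])]


def encode(triples: list[tuple[str, str, str]]) -> str:
    """Step 3: turn triples into a flat sequence of knowledge tokens."""
    tokens = []
    for head, rel, tail in triples:
        tokens.append(f"<ENT_{head.replace(' ', '_')}>")
        tokens.append(f"<REL_{rel.replace(' ', '_')}>")
        tokens.append(f"<ENT_{tail.replace(' ', '_')}>")
    return " ".join(tokens)


def build_prompt(query: str, kg: dict) -> str:
    """Step 4: place the encoded knowledge in front of the user query."""
    triples = []
    for entity in recognize_entities(query, kg):
        triples.extend(retrieve_subgraph(entity, kg))
    prefix = encode(triples)
    return f"{prefix}\n{query}" if prefix else query


prompt = build_prompt("Which prize was Einstein awarded?", KG)
print(prompt)
```

If no entity is recognized, the query passes through unchanged, so the model still receives a valid prompt.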

Section 04

Performance Evaluation and Efficiency Advantages of KoRe

Accuracy Performance

On knowledge question-answering tasks, LLMs equipped with KoRe perform competitively with specially fine-tuned models. The gain comes from changing how knowledge is accessed: the model reads the injected facts directly instead of recalling them from its parameters.

Token Efficiency Improvement

Token usage drops by up to 10x, for three reasons:

Structured compression: KG representations are more compact than natural language;
Redundancy removal: Eliminate redundant text information;
Precise injection: Only inject relevant subgraphs.
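
To make the compression argument concrete, here is a rough way to compare a text passage with its knowledge-token form. This is only a sketch: whitespace splitting is a crude proxy for a real tokenizer, it assumes each knowledge token is a single vocabulary entry, and the example passage is invented for illustration, so the resulting ratio says nothing about the reported 10x figure.

```python
def rough_token_count(text: str) -> int:
    """Crude proxy: count whitespace-separated units.

    Assumes each knowledge token (e.g. <ENT_Einstein>) is a single vocabulary
    entry; a real subword tokenizer would give different counts for prose.
    """
    return len(text.split())


def compression_ratio(passage: str, knowledge_tokens: str) -> float:
    """How many times more units the passage uses than the token sequence."""
    return rough_token_count(passage) / rough_token_count(knowledge_tokens)


# Illustrative inputs only; not taken from the KoRe benchmarks.
passage = (
    "Albert Einstein was a theoretical physicist. "
    "He was awarded the Nobel Prize in Physics."
)
knowledge_tokens = "<ENT_Einstein> <REL_awarded> <ENT_Nobel_Prize_in_Physics>"

print(rough_token_count(passage), rough_token_count(knowledge_tokens))
print(round(compression_ratio(passage, knowledge_tokens), 2))
```

The same measurement with the target model's actual tokenizer would be the meaningful number, since prose words often split into several subword tokens.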

Comparison with RAG

Dimension	RAG	KoRe
Knowledge Source	Unstructured documents	Structured knowledge graphs
Representation Form	Original text fragments	Discrete knowledge tokens
Interpretability	Medium (need to read text)	High (structured triples)
Token Efficiency	Low (retains original text)	High (compact encoding)
Update Flexibility	Need to reindex documents	Directly edit KG

Section 05

Application Scenarios and Current Limitations of KoRe

Application Scenarios

Domain knowledge enhancement: Inject professional domain knowledge (e.g., medical, legal) into general LLMs without retraining;
Dynamic fact updates: Real-time model knowledge updates via KG updates (e.g., news, sports results);
Interpretable question-answering: Answers can be traced to specific triples in the KG.
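
The last two scenarios can be illustrated with a tiny in-memory triple store. This is a hedged sketch, not part of KoRe: `EditableKG`, `upsert`, and `supporting_triples` are hypothetical names, each (head, relation) pair is assumed to hold a single value, and matching the answer against tail entities by substring is a deliberately simple stand-in for real provenance tracking.

```python
Triple = tuple[str, str, str]  # (head, relation, tail)


class EditableKG:
    """Tiny in-memory triple store; a stand-in for a real knowledge graph."""

    def __init__(self) -> None:
        # Assumption: each (head, relation) pair holds exactly one tail.
        self._edges: dict[tuple[str, str], str] = {}

    def upsert(self, head: str, relation: str, tail: str) -> None:
        """Add a fact, or overwrite the existing tail for (head, relation)."""
        self._edges[(head, relation)] = tail

    def one_hop(self, head: str) -> list[Triple]:
        """Return all triples whose head is the given entity."""
        return [(h, r, t) for (h, r), t in self._edges.items() if h == head]


def supporting_triples(answer: str, triples: list[Triple]) -> list[Triple]:
    """Return the injected triples whose tail entity appears in the answer."""
    lowered = answer.lower()
    return [t for t in triples if t[2].lower() in lowered]


kg = EditableKG()
kg.upsert("Example FC", "latest_result", "win")
before = kg.one_hop("Example FC")

# A fact changes: edit the KG, not the model.
kg.upsert("Example FC", "latest_result", "draw")
after = kg.one_hop("Example FC")

print(before)
print(after)
print(supporting_triples("The match ended in a draw.", after))
```

The next query about "Example FC" would receive the updated triple in its prefix, and the answer can be traced back to that triple.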

Limitations

Coverage limitations: 1-hop subgraphs cannot support multi-hop reasoning problems;
Token design overhead: Predefined vocabulary and encoding rules are required, posing challenges for managing ultra-large-scale KGs;
Coupling with model capabilities: Relies on LLMs' in-context learning ability.

Section 06

Future Directions of KoRe and Insights into Knowledge Representation

Future Research Directions

Multi-hop subgraph encoding: Support complex reasoning;
Adaptive token learning: Reduce manual design overhead;
Hybrid knowledge fusion: Combine KoRe (structured) with RAG (unstructured);
Incremental update mechanism: Build efficient KG index structures so updates do not require re-encoding the whole graph.

Insights into Knowledge Representation Paradigms

Trade-off between parameterization and explicitness: General language capabilities stay in the model's parameters, while specific facts are stored explicitly;
Neuro-symbolic hybrid architecture: Neural networks handle language understanding and generation, KGs handle knowledge storage and reasoning;
Modular knowledge services: Knowledge as an independent service dynamically injected into downstream models.

Section 07

Value and Future Outlook of KoRe

KoRe provides a lightweight path for LLM knowledge enhancement: careful representation design and simple prefix injection remove the need for expensive retraining. As AI systems grow more complex and knowledge needs more diverse, a flexible, efficient, and interpretable solution of this kind becomes increasingly important. As KG technology matures and LLM applications expand, KoRe could become a standard bridge between structured knowledge and neural language models.
