Core Architecture Design
The neural database adopts a four-layer hybrid architecture, with each layer optimized for different data access patterns:
SPO Triple Storage Layer
The system uses Subject-Predicate-Object (SPO) triples as its underlying data model. This representation, borrowed from semantic web technology, can express arbitrary relationships between entities without a predefined, rigid table schema. Triple storage lets the data grow naturally like a knowledge graph: new entities and relationships can be added at any time without affecting existing data.
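The pattern-matching behavior of a triple store can be sketched in a few lines. This is a minimal illustration, not the system's actual implementation; the names `TripleStore`, `add`, and `match` are assumptions made for the example.

```python
from typing import Optional

class TripleStore:
    """Minimal in-memory SPO triple store (illustrative sketch)."""

    def __init__(self):
        self.triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add((subject, predicate, obj))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None):
        # None acts as a wildcard, so any combination of the three
        # positions can be queried.
        return [t for t in self.triples
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)]

store = TripleStore()
store.add("Ada Lovelace", "wrote", "Note G")
store.add("Ada Lovelace", "collaborated_with", "Charles Babbage")
# A new predicate and entity require no schema change:
store.add("Note G", "describes", "Analytical Engine")

print(store.match(s="Ada Lovelace"))
```

Note how adding the third triple introduced both a new predicate and a new entity without touching any schema, which is the flexibility the paragraph above describes.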
Vector Embedding Index Layer
To enable semantic search, the system converts all text content into high-dimensional vector embeddings. These embeddings capture the semantic meaning of the text, so users can search for relevant data with natural-language descriptions instead of exact keyword matches. The vector index is built on OpenAI's embedding model and handles complex semantic-similarity queries.
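At query time, semantic search reduces to ranking documents by vector similarity. The sketch below uses hand-made three-dimensional toy vectors so it stays self-contained; in the real system each vector would come from OpenAI's embedding model and have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# (document, embedding) pairs; toy vectors stand in for real embeddings.
index = [
    ("cats are small pets", [0.9, 0.1, 0.0]),
    ("stock prices fell today", [0.0, 0.2, 0.9]),
    ("dogs are loyal pets", [0.8, 0.3, 0.1]),
]

def semantic_search(query_vec, k=2):
    # Rank all documents by similarity to the query vector, keep top k.
    scored = sorted(index, key=lambda d: cosine_similarity(query_vec, d[1]),
                    reverse=True)
    return [doc for doc, _ in scored[:k]]

# A query vector near the "pets" region of the toy space returns both
# pet documents, even though neither shares the word "pets" with a query.
print(semantic_search([0.85, 0.2, 0.05]))
```

A production index would not scan every document this way; approximate nearest-neighbor structures make the same ranking fast at scale.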
Keyword Search Layer
In addition to semantic search, the system retains a traditional inverted index to support precise keyword matching. This hybrid retrieval strategy gives users the convenience of semantic understanding while still allowing exact search when needed. The two search modes can be used independently or combined to obtain more accurate results.
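The inverted index behind the keyword layer can be sketched as a mapping from each token to the set of documents containing it. This example uses naive whitespace tokenization; a production index would add stemming, stop-word handling, and relevance ranking such as BM25.

```python
from collections import defaultdict

docs = {
    1: "triple store for knowledge graphs",
    2: "vector embeddings enable semantic search",
    3: "hybrid search combines keywords and vectors",
}

# Build the inverted index: token -> set of document ids.
inverted = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.lower().split():
        inverted[token].add(doc_id)

def keyword_search(query):
    # Intersect posting lists: every query term must appear in the doc.
    postings = [inverted[t] for t in query.lower().split()]
    return set.intersection(*postings) if postings else set()

print(keyword_search("semantic search"))
```

The result set from `keyword_search` can then be used on its own or fused with the semantic ranking (for example by intersecting candidates or merging scores) to realize the combined mode described above.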
Text-to-SQL Generation Layer
The top layer is an intelligent query interface that automatically converts a user's natural-language question into an executable query. This relies on the code-generation capability of large language models: after understanding the user's intent, the model generates the corresponding triple queries or a combined retrieval strategy.
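Structurally, this layer is a prompt template plus a model call whose output is executed against the lower layers. The sketch below shows only that shape; the prompt wording and the `call_llm` stub are assumptions for illustration, not the system's actual prompt or API.

```python
PROMPT_TEMPLATE = """You translate user questions into SPO triple queries.
A query is a pattern (subject, predicate, object) where None is a wildcard.

Question: {question}
Query:"""

def build_prompt(question: str) -> str:
    """Fill the (illustrative) prompt template with the user's question."""
    return PROMPT_TEMPLATE.format(question=question)

def call_llm(prompt: str) -> str:
    # Placeholder: in the real system this would send `prompt` to a
    # large language model and return its generated query text.
    raise NotImplementedError

def nl_to_query(question: str) -> str:
    # The generated query would then run against the triple store and
    # the retrieval layers described earlier.
    return call_llm(build_prompt(question))

print(build_prompt("What did Ada Lovelace write?"))
```

In practice this layer also has to validate the generated query before executing it, since model output is not guaranteed to be well-formed.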