Reading

TAHOE: A Text-to-SQL System that Automatically Learns Prompt Optimization from Experience

TAHOE uses an error-driven prompt learning process to integrate debugging traces into a structured prompt bank, combining syntactic and semantic prompts. It significantly improves Text-to-SQL performance without updating model parameters—on Spider 2.0-Snow, it increases GPT-5.5's pass rate from 61.95% to 79.42%.

Text-to-SQL提示优化大语言模型数据库查询自然语言处理提示工程Spider数据集

Published 2026-06-11 01:52Recent activity 2026-06-11 10:21Estimated read 5 min

TAHOE: A Text-to-SQL System that Automatically Learns Prompt Optimization from Experience

Section 01

TAHOE: Automated Hint Optimization for Text-to-SQL

TAHOE is a Text-to-SQL system that optimizes prompts via error-driven learning, building a structured hint bank combining syntax and semantic hints. It improves performance without updating model parameters—e.g., GPT-5.5's pass rate on Spider 2.0-Snow rises from 61.95% to 79.42%. The system treats prompt optimization as a dynamic data management problem rather than static template engineering.

Section 02

Background: Challenges in Text-to-SQL

Text-to-SQL faces real-world issues: strict SQL dialect requirements (Snowflake, PostgreSQL, etc.), large enterprise schema handling (hundreds of tables/columns), diverse user preferences. Existing solutions have limitations: supervised fine-tuning (SFT) is costly and inflexible; test-time scaling is effective but computationally expensive and hard to scale.

Section 03

TAHOE's Core Idea & Workflow

TAHOE's core is dynamic data management for prompt optimization, building a hint bank from experience. Its workflow has two phases: development (learn from debug traces to build initial hint bank) and deployment (update via user feedback, focus on development here). Hints are derived from feedback: compiler feedback → syntax hints (dialect rules); execution/user feedback → semantic hints (schema/user logic).

Section 04

TAHOE System Architecture

Key components: 1. Error-driven learning flow: generate SQL → validate (syntax/semantic) → analyze errors → generate hints → integrate/deduplicate into hint bank. 2. Strategy layer: handles intent conflicts using recency signals and success stats. 3. Inference steps: logic planning (identify query intent) → hint retrieval (based on entities/query type/user history) → SQL synthesis (combine hints with problem for LLM).

Section 05

Experimental Results & Analysis

Tested on Spider 2.0-Snow (Snowflake dialect) with GPT-5.5 and Doubao-2.0-lite. Results: GPT-5.5 pass rate up by 17.47% (61.95%→79.42%), 100% Snowflake syntax compliance, average compiler feedback rounds reduced from 2.79 to 0.12. Cross-model transfer: Doubao gets +19.7% pass rate. Ablation analysis confirms contributions of syntax hints (compliance), semantic hints (complex queries), strategy layer (robustness), and relevance-based retrieval.

Section 06

Practical Implications for LLM Applications

For Text-to-SQL: hint engineering often outperforms fine-tuning (lower cost, more flexible); error-driven learning is key for continuous improvement; layered hint design (syntax+semantic) balances compliance and logic. For general LLM apps: experience learning from debug traces; dynamic hint management (not static templates); feedback loops enable self-improvement.

Section 07

Limitations & Future Directions

Current focus on development phase (deployment feedback mechanism to be added). Future work: online learning (real-time user feedback integration), multi-dialect support, handling complex queries (subqueries, window functions, CTEs), security (prevent prompt/SQL injection).

Section 08

Summary of TAHOE's Contributions

TAHOE's core contributions: systematic error-driven hint learning flow, layered hint architecture (syntax+semantic+strategy). It significantly improves Text-to-SQL performance without model parameter updates, demonstrating the potential of prompt engineering and providing a feasible path for LLM integration into database applications.

Continue Reading

Keep going with more reads from the same topic.

Nornir MCP Server: An Enterprise-Grade Bridge for Integrating Large Language Models into Network Automation

Nornir MCP Server is an enterprise-level server based on the Model Context Protocol (MCP). It seamlessly integrates large language models (such as Claude) with the Nornir network automation framework, supporting natural language orchestration for multi-vendor network devices (Cisco, Arista, Juniper, etc.), and providing production-grade features like a dual-engine architecture (NAPALM + Netmiko), intelligent filtering, and a secure sandbox.

Recent activity 2026-05-06 20:51

Bibliothèque Française LLM: A French Public Domain Literature Index System Optimized for Large Language Models

Bibliothèque Française LLM is a structured indexing and annotation project for French public domain literature designed specifically for large language models (LLMs). It integrates multiple authoritative sources such as DraCor, Common Corpus, and Wikisource, providing metadata indexing categorized by genre, author, and era, as well as in-depth annotations for dramatic texts (including characters, lines, stage directions, etc.). Its aim is to enable LLMs to efficiently read and understand classic French literary works.

Recent activity 2026-05-06 20:50

Splinter: A Lock-Free Zero-Copy Shared Memory KV and Vector Storage Library That Eliminates Socket and Memcpy Overhead for LLM Inference

Splinter is a minimalist, high-performance key-value (KV) and vector storage system enabling zero-latency inter-process communication via shared memory and atomic operations. With only 766 lines of core code, it supports millions of operations per second and 768-dimensional vector storage, offering a new architectural approach for local LLM inference and data-intensive applications.

Recent activity 2026-04-03 08:49

libmlxforge: An Embedded MLX LLM Inference Engine for Apple Silicon

libmlxforge is an embeddable MLX large language model (LLM) inference engine designed specifically for Apple Silicon. It provides a unified C ABI interface, supports calls from Node.js, Swift, and Rust, and features continuous batching, streaming output, JSON-constrained structured output, and embedding vector generation.

Recent activity 2026-06-09 17:23