Zing Forum

When Large Language Models 'Pretend to Know': A Study on Speech Strategies in Information-Scarce Environments

This article introduces a study of how large language models respond when information is scarce. It finds that models use discourse strategies such as vague expressions and fabricated content to conceal the limits of their knowledge.

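Tags: large language models, hallucination, information scarcity, discourse analysis, AI safety, RAG, knowledge boundaries, uncertainty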
Published 2026-06-14 15:15 · Recent activity 2026-06-14 15:20 · Estimated read 5 min

Section 01

Introduction: Study on Speech Strategies of Large Language Models in Information-Scarce Environments

This article examines how large language models (LLMs) respond in information-scarce environments. The study finds that they use vague expressions and fabricated content to conceal their knowledge boundaries, and it discusses what this behavior means for AI safety, Retrieval-Augmented Generation (RAG) systems, and user education.

Section 02

Research Background and Problem Awareness

In recent years, LLMs have demonstrated strong language generation capabilities, but how they respond when information is scarce or a question lies beyond their knowledge boundaries has rarely been examined in depth. The conventional expectation is that a model should honestly admit ignorance; in practice, however, models tend to use discourse strategies to maintain an "informed" stance. This study analyzes that phenomenon.

Section 03

Research Methods and Subjects

The study took the "Fanciulla di Vagli" (Vagli Maiden), a figure in Italian folklore with very few documented records, as its case study. The topic was chosen deliberately because its knowledge boundaries are ambiguous, which makes it possible to observe how a model actually responds in a "knowledge vacuum".

Section 04

Key Findings: Four Typical Discourse Strategies

  1. Vague Expression: Using qualifiers like "may" or "it is said" to maintain semantic flexibility.
  2. Content Fabrication: Generating seemingly reasonable but unsubstantiated content to fill knowledge gaps.
  3. Strategic Vagueness: Giving general answers to avoid specific details, reducing the risk of being falsified.
  4. Ambiguous Source Attribution: Implying that content is based on "materials" or "research" to create false authority.
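
As a rough illustration of how the first and fourth strategies might be spotted in a response, here is a minimal Python sketch. The marker lists, the function names, and the sample sentence are all hypothetical and chosen for illustration; they are not taken from the study. Surface markers like these cannot detect fabricated content or strategic vagueness, which require checking claims against real sources.

```python
import re

# Hypothetical marker lists for illustration only; not from the study.
HEDGE_MARKERS = ["may", "might", "it is said", "reportedly", "perhaps"]
VAGUE_SOURCE_MARKERS = ["research suggests", "studies show", "materials indicate"]


def find_markers(text, markers):
    """Return the markers that occur in text as whole words or phrases."""
    lowered = text.lower()
    return [m for m in markers
            if re.search(r"\b" + re.escape(m) + r"\b", lowered)]


def flag_response(text):
    """Group the matched markers by the discourse strategy they hint at."""
    return {
        "hedges": find_markers(text, HEDGE_MARKERS),
        "vague_sources": find_markers(text, VAGUE_SOURCE_MARKERS),
    }


# Invented sample sentence, not real model output.
sample = ("It is said that the figure may date back centuries, "
          "and research suggests the tale is older still.")
print(flag_response(sample))
# {'hedges': ['may', 'it is said'], 'vague_sources': ['research suggests']}
```

A match is only a prompt to look closer: hedging is often appropriate, and the concern raised by the study is hedging used to mask a lack of knowledge.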

Section 05

Research Significance and Implications

  • AI Safety: Models' "pretending to know" may lead to excessive user trust, especially in professional scenarios.
  • RAG Systems: When retrieval results are insufficient, models may still use the above strategies.
  • User Education: Identifying vague expressions and suspicious sources can improve the ability to critically evaluate AI-generated content.
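
For the RAG point, one possible mitigation is to check retrieval quality before the model is asked to generate anything. The sketch below is an assumption-laden example, not something described in the study: the `Passage` type, the 0 to 1 score scale, the thresholds, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    score: float  # retriever relevance score, assumed to lie in [0, 1]


def build_prompt_or_abstain(question, passages, min_score=0.5, min_passages=1):
    """Return a grounded prompt, or None when the evidence is too thin.

    None signals that the caller should reply "not enough information"
    instead of letting the model fill the gap on its own.
    """
    evidence = [p for p in passages if p.score >= min_score]
    if len(evidence) < min_passages:
        return None
    context = "\n".join(f"[{i + 1}] {p.text}" for i, p in enumerate(evidence))
    return ("Answer using only the numbered passages below and cite them.\n"
            f"{context}\n\nQuestion: {question}")


weak = [Passage("Loosely related text.", 0.12)]
print(build_prompt_or_abstain("Who was the Fanciulla di Vagli?", weak))  # None
```

The thresholds would need tuning per retriever, since relevance scores are not comparable across systems.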

Section 06

Technical-Level Cause Analysis

The phenomenon stems from three factors:

  1. "Authority bias" in the training data: definitive-sounding answers are favored over hedged ones.
  2. Alignment training that overemphasizes helpfulness: the model prefers generating content to declining to answer.
  3. Evaluation metrics that reward fluent, complete answers rather than honest, accurate ones.
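
The third cause can be made concrete with a small worked example. The scoring rule here is an illustrative assumption, not the metric used in the study: a correct answer scores +1, an abstention scores 0, and a wrong answer costs `wrong_penalty`. Under an accuracy-only metric (penalty 0), guessing is never worse than abstaining, so a model is never rewarded for saying "I don't know".

```python
ABSTAIN_SCORE = 0.0  # assumed score for declining to answer


def expected_score(p_correct, wrong_penalty):
    """Expected score of answering: +1 if correct, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def best_action(p_correct, wrong_penalty):
    """Pick whichever action has the higher expected score."""
    if expected_score(p_correct, wrong_penalty) > ABSTAIN_SCORE:
        return "answer"
    return "abstain"


# A model that is only 10% likely to be right:
print(best_action(0.1, wrong_penalty=0.0))  # answer  (accuracy-only metric)
print(best_action(0.1, wrong_penalty=1.0))  # abstain (wrong answers penalized)
```

With this rule, answering pays off only when `p_correct` exceeds `wrong_penalty / (1 + wrong_penalty)`, so the penalty directly sets how confident a model must be before a guess is worthwhile.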

Section 07

Conclusion and Recommendations

Models' "pretending to know" is an adaptive behavior of complex systems in the face of uncertainty, not malicious deception. The study recommends building an "uncertainty expression" mechanism into model design so that AI learns to say "I don't know" at the right time, which it frames as a more honest form of intelligence.
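
One simple way such an "uncertainty expression" mechanism could be approximated is to sample several answers to the same question and abstain when they disagree. This is a sketch of a generic agreement check, not the mechanism proposed in the study; the function name, the `min_agreement` threshold, and the assumption that the samples come from repeated queries to the same model are all hypothetical.

```python
from collections import Counter


def answer_or_abstain(samples, min_agreement=0.6):
    """Return the majority answer, or "I don't know." if agreement is low.

    samples: answers from repeated queries to the same model (assumed input).
    """
    if not samples:
        return "I don't know."
    normalized = [s.strip().lower() for s in samples]
    top_answer, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) < min_agreement:
        return "I don't know."
    return top_answer


print(answer_or_abstain(["Paris", "paris", "Lyon"]))  # paris
print(answer_or_abstain(["a", "b", "c"]))             # I don't know.
```

Agreement is only a rough proxy for confidence: a model can be consistently wrong, so a check like this complements source verification and does not replace it.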