Steganography Without Modification: Hidden Communication via LLM Seeds

The study describes a steganographic channel that relies only on inherent properties of LLM inference stacks: the sender encodes a secret in the PRNG seed, and the receiver reconstructs token-level probability intervals from the generated text to recover that seed. When the prompt is known, 32-bit seeds are recovered with 100% accuracy within 300 tokens.

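Tags: Steganography, LLM Security, Pseudorandom Number Generator, Covert Communication, Deterministic Decoding, Security Vulnerability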
Published 2026-06-08 15:32 | Recent activity 2026-06-09 11:54 | Estimated read 6 min

Section 01

Introduction: LLM Seed Steganography—Hidden Communication Without Modification

Key Findings: The study reveals a steganographic channel that exploits inherent properties of LLM inference stacks. The sender encodes secret information in the PRNG seed, and the receiver reconstructs probability intervals from the generated text to recover that seed. When the prompt is known, a 100% recovery rate is achieved within 300 tokens. The channel requires no modification of model weights, sampling code, or output distributions, so even standard LLM services could potentially be used for hidden communication.

Section 02

Background: Inherent Steganographic Channels Exist in LLM Inference Stacks

Original Authors and Source

  • Original Authors: Paper research team
  • Source Platform: arXiv
  • Original Title: Steganography Without Modification: Hidden Communication via LLM Seeds
  • Original Link: http://arxiv.org/abs/2606.09135v1
  • Publication Date: June 8, 2026

Security Alert

Widely deployed LLM inference stacks have inherent steganographic channels that can be exploited without modifying model weights, sampling code, or output distributions—meaning standard LLM services may be used for hidden communication.

Section 03

Technical Principles and Operational Modes

Core Principles

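The channel exploits a structural feature of deterministic (seeded) decoding. In inverse transform sampling, each PRNG output selects the token whose cumulative-probability interval contains it, so every generated token reveals an interval that the corresponding PRNG output must have fallen into. This sequence of token-level probability intervals depends on the seed and can be reconstructed from the generated text.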
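
As a minimal sketch of this principle (not the paper's code), the Python below implements inverse transform sampling over a toy three-token distribution and reports the interval each draw landed in. The function name `sample_with_interval`, the distribution, and the use of Python's `random.Random` as the PRNG are illustrative assumptions; real inference stacks use their own generators and far larger vocabularies.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_with_interval(probs, u):
    """Inverse transform sampling: pick the token whose CDF interval contains u.

    Returns the token index and its half-open interval [lo, hi).
    """
    cdf = list(accumulate(probs))
    # Clamp in case floating-point error leaves cdf[-1] slightly below u.
    idx = min(bisect_right(cdf, u), len(probs) - 1)
    lo = cdf[idx - 1] if idx > 0 else 0.0
    return idx, (lo, cdf[idx])

# The seed fully determines the sequence of uniform draws, and each
# emitted token pins its draw to one interval of the CDF.
rng = random.Random(1234)          # hypothetical seed
probs = [0.5, 0.3, 0.2]            # toy next-token distribution
for step in range(5):
    u = rng.random()
    token, (lo, hi) = sample_with_interval(probs, u)
    print(step, token, f"[{lo:.2f}, {hi:.2f})")
```

An observer who sees only the tokens, and who can recompute `probs`, learns one interval per token without ever seeing `u` or the seed.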

Encoding and Decoding Process

  • Sender: Encode secret information into a PRNG seed, then generate text using standard sampling with this seed.
  • Receiver: Reconstruct the probability intervals from the text, then exhaustively search the seed space for the seed whose PRNG outputs fall inside those intervals, and extract the hidden payload from the recovered seed.
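
The sender and receiver roles above can be sketched end to end on a toy scale. This is a sketch under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for an LLM, the seed space is shrunk to 12 bits so the search finishes instantly on a CPU, and Python's `random.Random` stands in for the inference stack's PRNG.

```python
import random
from bisect import bisect_right
from itertools import accumulate

VOCAB_SIZE = 8  # toy vocabulary (assumption); real models have far more tokens

def toy_model(context):
    """Hypothetical stand-in for an LLM: deterministic next-token probabilities."""
    r = random.Random(",".join(map(str, context)))
    weights = [r.random() + 0.05 for _ in range(VOCAB_SIZE)]
    total = sum(weights)
    return [w / total for w in weights]

def cdf_of(probs):
    cdf = list(accumulate(probs))
    cdf[-1] = 1.0  # guard against floating-point shortfall in the last bin
    return cdf

def generate(prompt, seed, n_tokens):
    """Sender: ordinary seeded sampling; the seed itself is the payload."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(n_tokens):
        cdf = cdf_of(toy_model(prompt + tokens))
        tokens.append(bisect_right(cdf, rng.random()))
    return tokens

def intervals_for(prompt, tokens):
    """Receiver: replay the model to find the interval each token occupied."""
    out = []
    for i, tok in enumerate(tokens):
        cdf = cdf_of(toy_model(prompt + tokens[:i]))
        out.append((cdf[tok - 1] if tok else 0.0, cdf[tok]))
    return out

def recover(prompt, tokens, seed_bits):
    """Receiver: exhaustive search for seeds consistent with every interval."""
    intervals = intervals_for(prompt, tokens)
    matches = []
    for candidate in range(2 ** seed_bits):
        rng = random.Random(candidate)
        if all(lo <= rng.random() < hi for lo, hi in intervals):
            matches.append(candidate)
    return matches

prompt = [1, 2, 3]   # shared prompt, as token ids
secret = 0xABC       # 12-bit payload encoded as the seed
text = generate(prompt, secret, n_tokens=40)
print(recover(prompt, text, seed_bits=12))
```

Because `all(...)` stops at the first miss, most wrong candidates are rejected after one or two draws, which is what makes exhaustive search cheap in this sketch.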

Two Operational Modes

  • Known Prompt: Both parties share the prompt, so the receiver can reconstruct the intervals exactly, and forced alignment achieves perfect recovery.
  • Unknown Prompt: The receiver has only the generated text; it reconstructs the intervals approximately and recovers the seed by maximum hit count scoring, that is, by picking the candidate seed whose PRNG outputs land in the most intervals.
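
Hit count scoring for the unknown-prompt mode might look like the sketch below. Everything here is an illustrative assumption rather than the paper's procedure: `noisy_intervals` simulates an imperfect reconstruction by replacing a fraction of the true intervals with wrong ones, the seed space is 12 bits, and `random.Random` stands in for the real PRNG.

```python
import random

def hit_count(seed, intervals):
    """Number of intervals that this seed's PRNG outputs fall into."""
    rng = random.Random(seed)
    return sum(lo <= rng.random() < hi for lo, hi in intervals)

def recover_by_hits(intervals, seed_bits):
    """Pick the candidate seed with the maximum hit count."""
    return max(range(2 ** seed_bits), key=lambda s: hit_count(s, intervals))

def noisy_intervals(seed, n_tokens, width=0.2, error_rate=0.25, noise_seed=0):
    """Toy stand-in for approximate interval reconstruction.

    Most intervals contain the true draw; a fraction `error_rate`
    are placed at random, as if the receiver's model context were wrong.
    """
    rng = random.Random(seed)
    noise = random.Random(noise_seed)
    out = []
    for _ in range(n_tokens):
        u = rng.random()
        if noise.random() < error_rate:
            lo = noise.random() * (1 - width)            # wrong interval
        else:
            lo = max(0.0, min(u - width * noise.random(), 1 - width))
        out.append((lo, lo + width))
    return out

secret = 0x5A5                                   # 12-bit payload
intervals = noisy_intervals(secret, n_tokens=80)
print(recover_by_hits(intervals, seed_bits=12) == secret)
```

Exact matching would reject the true seed as soon as one interval is wrong; counting hits tolerates such errors, at the cost of needing more tokens to separate the true seed from chance matches.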

Section 04

Experimental Evidence and Analysis of Influencing Factors

Experimental Results

  • Known Prompt: Tested across 6 model families and 5 text domains; 32-bit seeds are recovered from the 2^32 candidate space with 100% accuracy within 300 tokens, in under 35 seconds on a single GPU.
  • Unknown Prompt: Recovery accuracy approaches 100% at 600 to 800 tokens, taking approximately 12 seconds.
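
A rough back-of-the-envelope reading of these numbers, not taken from the paper: assume a wrong seed's PRNG outputs behave like independent uniform draws, and let $p_t$ be the probability (interval width) of the $t$-th generated token over $T$ tokens. Under that assumption:

```latex
\Pr[\text{a wrong seed matches all } T \text{ intervals}] = \prod_{t=1}^{T} p_t ,
\qquad
\mathbb{E}[\#\,\text{spurious seeds}] = \left(2^{32} - 1\right) \prod_{t=1}^{T} p_t ,

\text{throughput} \;>\; \frac{2^{32}}{35\ \text{s}} \approx 1.2 \times 10^{8}\ \text{candidate seeds per second}.
```

So the true seed is expected to stand alone once $\sum_t \log_2(1/p_t)$ comfortably exceeds 32 bits, and the reported known-prompt search time implies a rate of more than roughly $1.2 \times 10^{8}$ candidates per second. Highly predictable text has $p_t$ close to 1 and contributes few bits per token, which is one plausible reason a few hundred tokens are needed in practice.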

Influencing Factors

  • Prompt Strategy: Affects probability distribution and reconstruction accuracy
  • Tokenization Ambiguity: Introduces noise
  • Sampling Hyperparameters (temperature, top-p): Affect channel capacity and reliability
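
To illustrate why sampling hyperparameters matter, the sketch below applies temperature and top-p to a toy logit vector and computes the entropy of the result. Using entropy as a proxy for how strongly each token narrows down the seed is an assumption of this example, not a metric from the paper; the logits and function names are likewise hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax (numerically stabilised)."""
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens with mass >= top_p, renormalised."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    out = [0.0] * len(probs)
    for i in kept:
        out[i] = probs[i] / mass
    return out

def bits_per_token(probs):
    """Entropy in bits: the expected value of -log2(interval width)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

logits = [2.0, 1.0, 0.5, 0.0]  # toy next-token logits
for temperature, top_p in [(1.0, 1.0), (0.5, 1.0), (1.0, 0.9), (1.0, 0.5)]:
    probs = top_p_filter(softmax(logits, temperature), top_p)
    print(temperature, top_p, round(bits_per_token(probs), 3))
```

Lower temperature or a tighter top-p concentrates probability on fewer tokens, so each token's interval is wider and excludes fewer candidate seeds; in the extreme case where only one token survives, that token says nothing about the seed.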

Section 05

Research Conclusions: Security Implications and Steganography Feasibility

  1. Steganographic transmission of 32-bit information is feasible, sufficient to deliver sensitive data such as key instructions and encryption keys.
  2. "Not knowing the prompt" is not a valid security assumption—hidden information can still be extracted even without the original prompt.
  3. Basic LLM components (e.g., PRNG) may become vectors for security attacks.

Section 06

Response Recommendations: Potential Mitigation Measures

Possible mitigations for this steganographic channel:

  • Use unpredictable random seed sources
  • Add random noise to inference services
  • Monitor abnormal generation patterns
  • Adopt security-hardened inference stacks for sensitive applications
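
The first recommendation could be approximated on the service side as in the sketch below. It is one possible reading of "unpredictable random seed sources", not a measure prescribed by the paper: the hypothetical helper `make_sampler_rng` discards any caller-supplied seed and returns Python's `random.SystemRandom`, which draws from the operating system's entropy source and has no seed to recover.

```python
import random

def make_sampler_rng(client_seed=None):
    """Return an RNG for token sampling that ignores caller-supplied seeds.

    random.SystemRandom reads from the OS entropy source (os.urandom),
    so there is no seed that a receiver could enumerate.
    """
    del client_seed  # deliberately unused: callers cannot choose the seed
    return random.SystemRandom()

# Two requests with the same requested seed no longer share a draw sequence.
rng_a = make_sampler_rng(client_seed=1234)
rng_b = make_sampler_rng(client_seed=1234)
print(rng_a.random() != rng_b.random())
```

The trade-off is that outputs are no longer reproducible from a seed, which some debugging and evaluation workflows rely on, and this only helps when the sender does not control the inference stack.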

Section 07

Broader Impact: System Design and Research Directions

For LLM Service Providers

Providers need to consider steganography resistance during the system design phase.

For Security Researchers

The work opens a new research direction: designing and evaluating generative models that resist steganography.

This study is not only a security vulnerability report but also a profound examination of the security boundaries of LLM systems.