# Janus: A Study on Side-Channel Attacks Against Sparse Attention LLM Inference

> The Janus project reveals a new type of security vulnerability introduced by sparse attention mechanisms in large language model (LLM) inference. By analyzing Sparse Induced Memory Access (SIMA) traces, attackers can infer sensitive attributes of user queries and recover the response content generated by the model without accessing model parameters or API outputs.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Posted: 2026-04-23T16:40:59.000Z
- Last activity: 2026-04-23T16:51:03.997Z
- Popularity: 150.8
- Keywords: sparse attention, side-channel attack, LLM security, privacy protection, memory access, SIMA, inference security, model privacy
- Page URL: https://www.zingnex.cn/en/forum/thread/janus-llm
- Canonical: https://www.zingnex.cn/forum/thread/janus-llm
- Markdown source: floors_fallback

---

## Janus: A Study on Side-Channel Attacks Against Sparse Attention LLM Inference (Introduction)

This thread introduces the research background, attack methods, implementation details, security impacts, and defense suggestions in separate floors.

## Research Background

The inference efficiency of large language models is a central concern for both academia and industry. Sparse attention mechanisms improve efficiency by skipping unimportant attention computations, but in doing so they may introduce new security vulnerabilities. The Janus project studies this issue systematically and reveals how sparse attention mechanisms can be maliciously exploited for side-channel attacks.

## Attack Principles and Types

The Janus attack exploits **Sparse Induced Memory Access (SIMA)** traces produced by sparse attention: during inference, the model dynamically selects which tokens participate in each attention computation, and this selection leaves distinctive memory access traces. By monitoring these traces, an attacker can reconstruct the sparse patterns and from them infer features of the input. The attack requires no model parameters, activation values, or API outputs, only hardware-level traces. The two main attack types are:

1. **Query Attribute Inference (QAI)**, in the prefill phase: analyze prefill SIMA traces and use a pre-trained MLP classifier to infer sensitive attributes of the query (e.g., disease categories in medical queries).
2. **Autoregressive Token Recovery (ATR)**, in the decoding phase: monitor decoding-phase traces to recover, step by step, the response content generated by the model.
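To make the QAI pipeline concrete, here is a minimal sketch of the classification step, under stated assumptions: the SIMA trace is simplified to a 0/1 vector marking which KV-cache blocks the sparse attention kernel touched during prefill, and the "pre-trained MLP classifier" is stood in for by a one-hidden-layer network with random weights. All dimensions, names, and weights are illustrative, not Janus's actual code.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def qai_predict(trace, W1, b1, W2, b2):
    """Score attribute classes from a binarized SIMA trace.

    `trace` is a 0/1 vector with one entry per KV-cache block,
    marking which blocks the sparse attention kernel accessed.
    """
    h = np.maximum(0.0, trace @ W1 + b1)  # ReLU hidden layer
    return softmax(h @ W2 + b2)           # class probabilities

# Toy dimensions: 64 memory blocks, 16 hidden units, 5 attribute classes.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(64, 16)); b1 = np.zeros(16)
W2 = rng.normal(size=(16, 5));  b2 = np.zeros(5)

trace = (rng.random(64) < 0.3).astype(float)  # simulated sparse access mask
probs = qai_predict(trace, W1, b1, W2, b2)
print("predicted attribute class:", int(probs.argmax()))
```

In the real attack the weights would be trained on labeled traces collected by the attacker running queries of known attribute classes through the same sparse attention kernel.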

## Technical Implementation and Evidence

Janus provides complete attack code in two modules:

1. **Prefill attribute inference module**: sparse patterns for 20 verification queries, a pre-trained attribute predictor, inference scripts, and result files.
2. **Decoding token recovery module**: sparse patterns from the decoding phase, a pre-trained token predictor, inference scripts, and recovery results.

The decoding-phase attack was verified on 20 queries (each generating 300 tokens over a vocabulary of 1758 tokens) and achieved token-by-token recovery.
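The token-by-token recovery loop can be sketched as follows. This is a simplified stand-in for Janus's pre-trained token predictor (whose exact form is not reproduced here): each token in the 1758-entry vocabulary is assumed to have a reference sparse-access pattern, and each observed decoding-step trace is matched to the nearest reference pattern. The pattern bank and traces below are synthetic.

```python
import numpy as np

def recover_tokens(step_traces, pattern_bank, tokens):
    """Recover a response token-by-token from decoding-phase SIMA traces.

    `pattern_bank[i]` is the reference sparse-access pattern assumed to
    be associated with vocabulary entry `tokens[i]`; each observed step
    trace is matched to the nearest pattern by Hamming distance.
    """
    recovered = []
    for trace in step_traces:
        dists = np.abs(pattern_bank - trace).sum(axis=1)  # Hamming distance
        recovered.append(tokens[int(dists.argmin())])
    return recovered

rng = np.random.default_rng(1)
vocab = list(range(1758))                          # vocabulary size from the study
bank = (rng.random((1758, 64)) < 0.2).astype(int)  # one pattern per token
true_ids = rng.integers(0, 1758, size=10)          # ground-truth response tokens
traces = bank[true_ids]                            # noise-free traces for the demo
recovered = recover_tokens(traces, bank, vocab)
print(sum(int(r) == int(t) for r, t in zip(recovered, true_ids)), "of 10 tokens recovered")
```

Real traces are noisy and per-step patterns depend on context, which is why Janus uses a trained predictor rather than exact matching; the loop structure, however, is the same.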

## Security Impacts

The Janus attack poses serious privacy risks:

1. **Query content inference**: sensitive user attributes (e.g., medical conditions) can be extracted.
2. **Response content recovery**: the model's complete response can be reconstructed.
3. **Multi-tenant risk**: on shared hardware, co-located users can snoop on each other's inference.

Attack scenarios include cloud inference services, edge devices, and Model-as-a-Service (MaaS).

## Defense Suggestions

Defenses against the Janus attack can be implemented at multiple levels:

- **Hardware level**: memory access isolation, cache partitioning, constant-time sparse attention algorithms.
- **Software level**: sparse pattern obfuscation, memory access randomization, security auditing.
- **Architecture level**: Trusted Execution Environments (TEEs), homomorphic encryption inference, federated inference.
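As one illustration of the software-level defenses, sparse pattern obfuscation can be sketched as padding the truly selected KV-cache blocks with random dummy fetches so that every step touches the same fixed number of blocks. This is a simplified model, not the algorithm from the study; a real mitigation would pad inside the attention kernel itself.

```python
import numpy as np

def obfuscate_pattern(selected, n_blocks, budget, rng):
    """Pad the truly selected KV-cache blocks with random decoy fetches.

    The kernel then always touches exactly `budget` of the `n_blocks`
    blocks, so the observed SIMA trace no longer reveals how many (or
    directly which) blocks the sparse selection actually chose.
    """
    assert len(selected) <= budget <= n_blocks
    decoys = [b for b in range(n_blocks) if b not in set(selected)]
    pad = rng.choice(decoys, size=budget - len(selected), replace=False)
    pattern = np.zeros(n_blocks, dtype=int)
    pattern[list(selected) + list(pad)] = 1  # real blocks + decoys
    return pattern

rng = np.random.default_rng(2)
obs = obfuscate_pattern([3, 7, 12], n_blocks=64, budget=16, rng=rng)
print("blocks touched:", int(obs.sum()))  # always equals `budget`
```

The cost of this defense is the wasted bandwidth of the decoy fetches, which is exactly the security/efficiency trade-off the study highlights.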

## Research Summary

The Janus project reveals the latent security risks of sparse attention optimization, a reminder that security must be weighed alongside inference efficiency. As LLMs are increasingly deployed in sensitive domains, understanding and mitigating such side-channel attacks becomes crucial. A balance must be struck between security and efficiency to preserve the privacy guarantees of LLM services.
