# Voxel: An End-to-End Simulation Framework for 3D Stacked AI Chip Architectures

> This article introduces Voxel, a fast end-to-end simulation framework for 3D stacked AI chips. The framework supports software/hardware co-exploration, allows custom model execution plans via ML compilers, comprehensively analyzes how computing paradigms, mapping strategies, and interconnection topologies affect the efficiency of 3D stacked chips, and provides important insights for the design of next-generation AI chips.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-04-29T15:48:46.000Z
- Last activity: 2026-04-30T02:26:29.924Z
- Popularity: 140.4
- Keywords: 3D stacked chips, AI chip architecture, LLM inference, memory bandwidth, chip simulation, TSV, network-on-chip, compiler optimization
- Page link: https://www.zingnex.cn/en/forum/thread/voxel-3dai
- Canonical: https://www.zingnex.cn/forum/thread/voxel-3dai
- Markdown source: floors_fallback

---

## Introduction: Voxel, an End-to-End Simulation Framework for 3D Stacked AI Chip Architectures

Voxel is a fast end-to-end simulation framework for 3D stacked AI chips. It supports software/hardware co-exploration, lets ML compilers supply custom model execution plans, analyzes how computing paradigms, mapping strategies, and interconnection topologies affect the efficiency of 3D stacked chips, and yields insights for the design of next-generation AI chips.

## Background: Memory Wall Challenges of AI Chips and Exploration of 3D Stacking

The rapid development of Large Language Models (LLMs) has created severe memory bottlenecks: compute units sit idle waiting for data, and this "memory wall" limits LLM inference performance. In traditional 2D chips, memory and compute units are connected via off-chip buses with limited bandwidth and high latency. A 3D stacking architecture instead places DRAM layers directly above the compute cores and uses through-silicon vias (TSVs) for high-bandwidth access, but its design complexity is high, and the interplay of many factors makes efficiency hard to evaluate.
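The memory wall can be made concrete with a back-of-envelope arithmetic-intensity check. The sketch below uses assumed hardware numbers (peak compute, off-chip bandwidth) and a single transformer projection layer; none of the figures come from the article.

```python
# Illustrative arithmetic-intensity estimate for single-token LLM decode.
# All hardware numbers are assumed for illustration only.

def gemv_intensity(rows, cols, bytes_per_weight=2):
    """FLOPs per byte moved for a dense matrix-vector product (batch 1)."""
    flops = 2 * rows * cols                       # one multiply + one add per weight
    bytes_moved = rows * cols * bytes_per_weight  # fp16 weight traffic dominates
    return flops / bytes_moved

peak_flops = 100e12   # assumed compute peak, 100 TFLOP/s
mem_bw = 2e12         # assumed off-chip bandwidth, ~2 TB/s
machine_balance = peak_flops / mem_bw  # FLOPs the chip can sustain per byte

ai = gemv_intensity(4096, 4096)        # one 4096x4096 projection layer
print(f"arithmetic intensity: {ai:.2f} FLOP/B")
print(f"machine balance:      {machine_balance:.0f} FLOP/B")
print(f"memory-bound: {ai < machine_balance}")
```

With batch-1 decode at roughly 1 FLOP per byte against a machine balance of tens of FLOPs per byte, the compute units are starved by memory traffic, which is exactly the regime 3D stacked DRAM targets.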

## Methodology: Core Features of the Voxel Framework

Voxel is a fast, compiler-aware end-to-end simulation framework for exploring the efficiency of 3D stacked AI chips in LLM inference. What sets it apart is software-hardware co-exploration: it provides a programming interface through which ML compilers can customize model execution plans, supports testing operator fusion, memory scheduling, and parallel configurations, and ensures reliable simulation results through cross-validation against real silicon simulators.
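To make the idea of a compiler-facing execution-plan interface concrete, here is a minimal sketch of what such an interface could look like. All names (`ExecutionPlan`, `FusedOp`, the mapping dictionaries) are hypothetical illustrations, not Voxel's actual API.

```python
# Hypothetical sketch of a compiler-facing execution-plan interface.
# Class and field names are illustrative assumptions, not Voxel's real API.
from dataclasses import dataclass, field

@dataclass
class FusedOp:
    name: str     # label for the fused kernel, e.g. "matmul+bias"
    ops: list     # operators fused into one kernel launch
    tile: tuple   # tile shape executed per core

@dataclass
class ExecutionPlan:
    fused_ops: list = field(default_factory=list)
    tensor_to_bank: dict = field(default_factory=dict)  # tensor -> DRAM bank
    tile_to_core: dict = field(default_factory=dict)    # tile id -> core id

    def add_fusion(self, name, ops, tile):
        """Record an operator-fusion decision made by the ML compiler."""
        self.fused_ops.append(FusedOp(name, ops, tile))
        return self

# A compiler could emit a plan like this and hand it to the simulator:
plan = (ExecutionPlan()
        .add_fusion("qkv_proj", ["matmul", "bias"], (128, 128)))
plan.tensor_to_bank["wq"] = 0   # pin Q weights to the DRAM bank above core 0
print(len(plan.fused_ops))      # -> 1
```

The point of such an interface is that fusion, memory scheduling, and parallelism decisions become explicit, swappable inputs to the simulator rather than fixed assumptions baked into it.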

## Analysis: Multi-dimensional Factors Affecting the Efficiency of 3D Stacked Chips

Voxel analyzes efficiency-influencing factors along multiple dimensions:
- Computing paradigm: paradigms such as weight-stationary dataflow behave differently under 3D stacking; a paradigm that is unremarkable in traditional 2D designs may gain an advantage here.
- Mapping strategy: a poor tile-to-core mapping leaves core loads uneven, and the tensor-to-bank mapping determines how often memory accesses conflict.
- Interconnection topology: different NoC topologies (mesh, ring, etc.) and bandwidth configurations affect performance.
- Storage parameters: DRAM bank bandwidth trades off against SRAM capacity.
- Power and thermal constraints: high-density integration makes heat dissipation harder, requiring a balance between performance and reliability.
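The mapping-strategy point above can be illustrated with a toy load-balance calculation. The tile costs, core count, and mapping functions below are made-up values for illustration, not data from Voxel.

```python
# Toy illustration: how tile-to-core mapping changes core load balance.
# Tile costs and core count are invented for this example.

def core_loads(tile_costs, n_cores, mapping):
    """Sum the cost assigned to each core under a given mapping function."""
    loads = [0] * n_cores
    for tile, cost in enumerate(tile_costs):
        loads[mapping(tile, n_cores)] += cost
    return loads

def imbalance(loads):
    """Max load over mean load; 1.0 means perfectly even."""
    return max(loads) / (sum(loads) / len(loads))

# Skewed per-tile costs: later tiles (e.g. shorter sequence tails) are cheaper.
costs = [100, 90, 80, 70, 40, 30, 20, 10]

blocked = lambda t, n: t * n // len(costs)   # contiguous chunk of tiles per core
round_robin = lambda t, n: t % n             # interleave tiles across cores

print(imbalance(core_loads(costs, 4, blocked)))      # blocked: poor balance
print(imbalance(core_loads(costs, 4, round_robin)))  # round-robin: much closer to even
```

Even in this tiny example the blocked mapping overloads one core while others idle, whereas interleaving spreads the skewed costs, which is the kind of effect a mapping-aware simulator is built to expose.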

## Conclusion: Key Findings for 3D Stacked Chip Design

Important conclusions from Voxel simulations:
- Collaborative optimization: end-to-end efficiency depends on jointly optimizing computing paradigms, mapping strategies, interconnection topologies, and related factors; optimizing any one in isolation has little effect.
- Mapping strategy: tile-to-core and tensor-to-bank mappings have a decisive impact, producing several-fold performance differences under identical hardware configurations.
- Bandwidth-latency trade-off: memory bandwidth and latency interact in complex ways; Voxel helps identify the critical points that should guide parameter configuration.
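One way to see why bandwidth and latency trade off rather than act independently is a Little's-law style concurrency model: sustained bandwidth is capped by how many bytes can be in flight during one access latency. The sketch below uses assumed parameter values, not Voxel results.

```python
# Back-of-envelope bandwidth-latency model (Little's law).
# All parameter values are assumed for illustration, not Voxel results.

def effective_bandwidth(peak_bw, latency_ns, outstanding, req_bytes=64):
    """Sustained B/s given a limit on outstanding memory requests."""
    # Concurrency-limited throughput: bytes in flight / round-trip latency.
    concurrency_bw = outstanding * req_bytes / (latency_ns * 1e-9)
    return min(peak_bw, concurrency_bw)

peak = 1e12                    # assumed 1 TB/s of stacked-DRAM bandwidth
for lat in (50, 100, 200):     # assumed DRAM access latencies in ns
    bw = effective_bandwidth(peak, lat, outstanding=128)
    print(f"{lat:3d} ns -> {bw / 1e9:.0f} GB/s")
```

With a fixed request window, doubling latency halves the achievable bandwidth until the concurrency limit rises above the peak, so the "critical point" is where provisioned bandwidth stops being the binding constraint.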

## Significance: Value of Voxel for AI Chip Design

The value of the Voxel framework includes:
- Reduce design risks: Explore the design space before tape-out to evaluate the pros and cons of schemes.
- Accelerate innovation iteration: Quickly simulate and test a large number of hypotheses to shorten the cycle.
- Open-source contribution: The team commits to open-sourcing the framework and results, providing a research foundation for academia and industry.

## Limitations and Future: Improvement Directions for Voxel

Voxel's current limitations include the trade-off between accuracy and speed (high-fidelity simulation is slow), the need for continual updates to support new architectures, and the need for validation on more real LLM workloads. Future work includes supporting more complex 3D configurations, integrating more compiler optimizations, and coupling more tightly with actual hardware.
