Zing Forum

ATLAS: The First Full-Stack Performance Evaluation Framework for 3D-DRAM Large Language Model Accelerators

This article introduces ATLAS, the first silicon-validated simulation framework for 3D-DRAM large language model (LLM) accelerators. It gives researchers an open, full-stack performance analysis tool in a field that has so far lacked public evaluation methods.

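Tags: 3D-DRAM, large language models, accelerators, performance evaluation, ATLAS framework, memory bottleneck, hybrid bonding technology, design space exploration, full-stack simulation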
Published 2026-04-09 17:48 | Recent activity 2026-04-10 10:14 | Estimated read 5 min

Section 01

[Introduction] ATLAS Framework: The First Silicon-Validated Full-Stack Evaluation Tool for 3D-DRAM LLM Accelerators

ATLAS is the first full-stack simulation framework for 3D-DRAM LLM accelerators to be validated against real silicon, addressing the lack of public performance evaluation tools in this area. Built on commercial 3D-DRAM technology, it offers an open, general-purpose, high-accuracy performance analysis platform that supports arbitrary inference scenarios. It is intended to help researchers explore the design space and to support the growth of a 3D-DRAM accelerator ecosystem.

Section 02

Background: Memory Bottlenecks in Large Model Inference and Limitations of Existing Evaluation Tools

LLM inference, especially the decoding phase, is memory-intensive, so memory bandwidth becomes the key bottleneck. 3D-DRAM is an attractive option because of its high bandwidth density and energy efficiency. However, existing 3D-DRAM accelerator work relies on closed-source evaluation tools, which leads to fragmented modeling and results that are hard to compare, and this slows progress.
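Why decoding is bandwidth-bound can be illustrated with a simple roofline estimate. The sketch below is not part of ATLAS; the function names, the 2-FLOPs-per-parameter rule of thumb for a dense model, and the hardware numbers in the usage lines are all assumptions chosen for illustration. It also ignores KV-cache and activation traffic.

```python
def decode_arithmetic_intensity(batch_size: int, bytes_per_param: float = 2.0) -> float:
    """Approximate FLOPs per byte of weight traffic for one decode step.

    Assumption: a dense model where every parameter is read once per step
    and contributes about 2 FLOPs (multiply + add) per sequence in the batch.
    """
    return 2.0 * batch_size / bytes_per_param


def attainable_tflops(intensity: float, peak_tflops: float, bandwidth_tbps: float) -> float:
    """Roofline model: throughput is capped by compute or by memory bandwidth.

    intensity is in FLOP/byte and bandwidth in TB/s, so their product is TFLOP/s.
    """
    return min(peak_tflops, intensity * bandwidth_tbps)


def is_memory_bound(intensity: float, peak_tflops: float, bandwidth_tbps: float) -> bool:
    """True when the bandwidth roof lies below the compute roof."""
    return intensity * bandwidth_tbps < peak_tflops


if __name__ == "__main__":
    # Hypothetical accelerator: 100 TFLOP/s peak compute, 2 TB/s memory bandwidth.
    for batch in (1, 8, 64):
        ai = decode_arithmetic_intensity(batch)
        print(batch, ai, attainable_tflops(ai, 100.0, 2.0), is_memory_bound(ai, 100.0, 2.0))
```

Under these assumptions, single-sequence decoding sits far below the compute roof, which is why raising memory bandwidth (the motivation for 3D-DRAM) helps more than adding compute at small batch sizes.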

Section 03

Core Design of the ATLAS Framework: Unified Abstraction and Real Silicon-Based Foundation

ATLAS is built on the characteristics of commercial 3D-DRAM silicon and introduces a unified abstraction at two levels. At the system architecture level, it defines standardized component interfaces and interconnection models. At the programming primitive level, it provides general abstractions for compute and storage operations that hide hardware differences. Together these let the framework cover LLMs of different scales as well as both single-user low-latency and high-throughput batched inference.
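To make the two abstraction levels more concrete, here is a minimal sketch of what "standardized component interfaces" plus "general compute and storage primitives" could look like. Every name in it (`Compute`, `MemRead`, `Component`, `ComputeArray`, `DramStack`, `simulate`) is hypothetical and is not taken from ATLAS; the timing model is a deliberately naive serial one with no overlap or interconnect modeling.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence, Union


@dataclass(frozen=True)
class Compute:
    """Hardware-independent compute primitive (hypothetical)."""
    flops: float


@dataclass(frozen=True)
class MemRead:
    """Hardware-independent storage primitive (hypothetical)."""
    num_bytes: float


Primitive = Union[Compute, MemRead]


class Component(Protocol):
    """Standardized interface every hardware block implements (hypothetical)."""
    def accepts(self, op: Primitive) -> bool: ...
    def latency_s(self, op: Primitive) -> float: ...


@dataclass
class ComputeArray:
    peak_flops: float  # FLOP/s

    def accepts(self, op: Primitive) -> bool:
        return isinstance(op, Compute)

    def latency_s(self, op: Primitive) -> float:
        return op.flops / self.peak_flops


@dataclass
class DramStack:
    bandwidth_bytes_per_s: float

    def accepts(self, op: Primitive) -> bool:
        return isinstance(op, MemRead)

    def latency_s(self, op: Primitive) -> float:
        return op.num_bytes / self.bandwidth_bytes_per_s


def simulate(program: Sequence[Primitive], components: Sequence[Component]) -> float:
    """Sum per-primitive latencies serially; raises StopIteration if no unit accepts an op."""
    total = 0.0
    for op in program:
        unit = next(c for c in components if c.accepts(op))
        total += unit.latency_s(op)
    return total
```

The point of the design is that the same `program` can be replayed against different component lists, so changing the hardware configuration does not require rewriting the workload description.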

Section 04

Evidence: Silicon Validation Accuracy and Design Space Insights

ATLAS has been validated against real silicon: simulation error is at most 8.57%, and the correlation between simulated and measured performance ranges from 97.26% to 99.96%. Design space exploration shows that the ratio of memory bandwidth to compute units has an optimal range, and that the 3D-DRAM hierarchical scheduling strategy must be adjusted for different batch sizes to make full use of the available bandwidth.
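The article does not say how the error and correlation figures were computed. As a hedged illustration only, the sketch below assumes "error" means relative error against the measured value and "correlation" means the Pearson coefficient; the helper names and the sample numbers in the usage lines are invented for the example and are not ATLAS results.

```python
import math
from typing import List, Sequence


def relative_errors(simulated: Sequence[float], measured: Sequence[float]) -> List[float]:
    """Per-point |simulated - measured| / |measured| (assumed error definition)."""
    return [abs(s - m) / abs(m) for s, m in zip(simulated, measured)]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up latencies in milliseconds, purely to exercise the functions.
    measured = [10.0, 20.0, 40.0, 80.0]
    simulated = [10.5, 19.0, 41.0, 78.0]
    print("max relative error:", max(relative_errors(simulated, measured)))
    print("pearson correlation:", pearson(simulated, measured))
```

Reporting both metrics is useful because a simulator can track trends well (high correlation) while still being offset in absolute terms (high error), or the reverse.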

Section 05

Open Ecosystem: Open-Source Plan and Domain Development Recommendations

The research team will open-source the ATLAS framework to remove closed-source barriers and let more researchers take part. The goals are to improve the framework iteratively through community contributions and to establish unified evaluation benchmarks that promote fair comparison and cooperation and help the field mature.

Section 06

Conclusion: ATLAS Reshapes the Research Paradigm of 3D-DRAM LLM Accelerators

ATLAS marks a new stage in research on 3D-DRAM LLM accelerators: from closed-source tools to an open platform, from fragmented modeling to unified abstraction, and from speculative design to data-driven optimization. It should help the technology reach a better balance among performance, energy efficiency, and cost, paving the way for broader, more accessible deployment of LLMs.