Zing Forum

hwLedger: A Capacity Planning and Heterogeneous Cluster Management Tool for LLM Deployment

hwLedger is an Apache-2.0-licensed desktop application focused on VRAM planning, heterogeneous device management, and local inference for LLM deployment, with precise capacity calculations for a range of attention architectures.

Tags: LLM deployment · capacity planning · VRAM calculation · heterogeneous cluster · Apple Silicon · MoE · MLA · open-source tool
Published 2026/04/19 17:34 · Last activity 2026/04/19 17:54 · Estimated reading time: 7 minutes

Section 01

hwLedger: Open-Source Tool for LLM Deployment Capacity Planning & Heterogeneous Cluster Management

hwLedger is an Apache-2.0-licensed desktop application plus agent/server combination, positioned as an LLM infrastructure management tool with "hobbyist scale, enterprise-grade architecture". It addresses two key pain points in LLM deployment: accurate VRAM calculation for modern architectures (such as MoE and MLA) and unified management of heterogeneous device clusters. Core capabilities include architecture-aware capacity planning, real-time telemetry validation, local inference (optimized for Apple Silicon), and cross-device cluster management.

Section 02

Challenges in LLM Deployment Addressed by hwLedger

LLM deployment faces two main challenges:

  1. Inaccurate VRAM Calculation: Existing tools (HF Accelerate, can-it-run-llm) struggle with modern architectures: they confuse MoE's resident vs. activated parameters, underestimate MLA's KV cache, and mishandle GQA's grouping logic.
  2. Heterogeneous Cluster Management: Managing distributed devices (local NVIDIA/AMD workstations, Apple Silicon laptops, cloud instances like Vast.ai) lacks unified tools for scheduling and cost optimization. hwLedger aims to fill these gaps.
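The MoE pitfall in point 1 can be made concrete with rough arithmetic. The sketch below is illustrative only (not hwLedger's actual math core); it assumes a SwiGLU FFN (three projection matrices per expert) and ignores embeddings, norms, and router weights.

```python
# Illustrative sketch (not hwLedger's math core): why MoE VRAM planning
# must separate resident parameters (every expert lives in VRAM) from
# activated parameters (only the top-k routed experts run per token).

def moe_param_counts(n_layers, d_model, d_ff, n_experts, top_k,
                     attn_params_per_layer):
    # Per-expert FFN weights, assuming a SwiGLU FFN (gate/up/down matrices).
    expert_params = 3 * d_model * d_ff
    # Resident: attention weights plus ALL experts must be held in memory.
    resident = n_layers * (attn_params_per_layer + n_experts * expert_params)
    # Activated: attention weights plus only the top-k experts per token.
    activated = n_layers * (attn_params_per_layer + top_k * expert_params)
    return resident, activated

# Mixtral-8x7B-like shape (embeddings, norms, and router omitted):
resident, activated = moe_param_counts(
    n_layers=32, d_model=4096, d_ff=14336, n_experts=8, top_k=2,
    attn_params_per_layer=4 * 4096 * 4096)
print(f"resident = {resident/1e9:.1f}B, activated = {activated/1e9:.1f}B")
```

With these assumed shapes, the resident count is roughly 47B while the activated count is roughly 13B; a calculator that only counts activated parameters would undersize weight VRAM by more than 3x.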

Section 03

Layered Architecture & Architecture-Aware Capacity Calculation

Layered Architecture:

  • Core Layer: Rust-based (hwledger-core, arch, ingest, probe, etc.) for performance and reliability.
  • Sidecar Layer: Forked oMlx for optimized local inference on Apple Silicon.
  • Native App Layer: Platform-specific UIs (SwiftUI for macOS, WinUI3 for Windows, Qt/Slint for Linux).
  • Cluster Communication: Axum (mTLS for agents), russh (SSH for non-agent devices), cloud APIs (reqwest), Tailscale (local network discovery).

Core Innovation: Architecture-aware math core uses dedicated formulas for each AttentionKind (MHA/GQA/MQA/MLA/Sliding Window/SSM/Hybrid/Sink), distinguishing resident vs activation parameters for precise VRAM calculation.
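To illustrate why per-AttentionKind formulas matter, here is a hedged sketch of KV-cache sizing for a few of the kinds listed above (MHA/GQA/MQA/MLA). The function and parameter names are illustrative assumptions, not hwLedger's API; the SSM, sliding-window, and hybrid kinds are omitted.

```python
# Illustrative sketch (not hwLedger's API): KV-cache bytes per attention kind.

def kv_cache_bytes(kind, n_layers, seq_len, n_heads, head_dim,
                   n_kv_heads=None, kv_lora_rank=None, rope_dim=None,
                   bytes_per_elem=2):
    if kind in ("mha", "gqa", "mqa"):
        # MHA caches K/V for every head; MQA for a single shared head;
        # GQA for one head per group.
        kv_heads = {"mha": n_heads, "mqa": 1}.get(kind, n_kv_heads)
        per_token = 2 * kv_heads * head_dim        # one K and one V vector
    elif kind == "mla":
        # MLA caches a compressed KV latent plus a decoupled RoPE key,
        # far smaller than full per-head K/V tensors.
        per_token = kv_lora_rank + rope_dim
    else:
        raise NotImplementedError(kind)
    return n_layers * seq_len * per_token * bytes_per_elem

# Llama-3-8B-like GQA shape at 8k context, fp16 cache:
gqa = kv_cache_bytes("gqa", n_layers=32, seq_len=8192,
                     n_heads=32, head_dim=128, n_kv_heads=8)
# Same shape mistakenly treated as plain MHA (a common calculator error):
mha = kv_cache_bytes("mha", n_layers=32, seq_len=8192,
                     n_heads=32, head_dim=128)
print(gqa / 2**30, mha / 2**30)   # 1.0 GiB vs 4.0 GiB
```

Ignoring the grouping logic here inflates the KV-cache estimate 4x; conversely, applying an MHA formula to an MLA model overestimates it by an even larger factor.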

Section 04

Key Capabilities of hwLedger

  1. VRAM & Throughput Planning: Architecture-aware formulas for accurate calculation of model weights, KV Cache, activations, and system overhead.
  2. Real-Time Telemetry: Compares predicted resource needs with actual data from engines like MLX, mistral.rs, llama.cpp, vLLM, TGI.
  3. Local Inference: On Apple Silicon, uses oMlx sidecar with SSD-paged KV Cache to extend context length.
  4. Heterogeneous Cluster Management: Unifies local/cloud devices with event-sourced audit logs, scheduling planners, and spot price-aware cost models.
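Capability 1 boils down to summing four terms. The following is a minimal budget sketch, not hwLedger's actual planner; the 1 GiB default overhead is an assumed placeholder.

```python
# Minimal VRAM budget sketch (illustrative, not hwLedger's planner):
# total = weights + KV cache + activations + system overhead.

def vram_estimate_gib(param_count, bits_per_weight,
                      kv_cache_bytes, activation_bytes,
                      overhead_bytes=1 << 30):   # assumed ~1 GiB overhead
    weight_bytes = param_count * bits_per_weight / 8
    total = weight_bytes + kv_cache_bytes + activation_bytes + overhead_bytes
    return total / 2**30

# 8B model at 4-bit quantization, 1 GiB KV cache, 0.5 GiB activations:
need = vram_estimate_gib(8_000_000_000, 4,
                         kv_cache_bytes=1 << 30,
                         activation_bytes=1 << 29)
print(f"{need:.1f} GiB")
```

Under these assumptions the model needs roughly 6.2 GiB, so it fits an 8 GiB GPU but not a 6 GiB one; the telemetry comparison in capability 2 exists precisely to check such predictions against what the engine actually allocates.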

Section 05

Application Scenarios for hwLedger

  • Individual Developers: Choose model quantization levels, determine max context length, evaluate inference engine efficiency.
  • Small Teams: Get unified device resource views, optimize model deployment scheduling, track costs.
  • Edge Deployment: Assess hardware feasibility for LLM runs, optimize configurations to fit edge device limits.
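For the "determine max context length" task above, planning reduces to solving the VRAM budget for sequence length. A hedged back-of-envelope sketch follows (assumed GQA shapes and an fp16 KV cache, not hwLedger's actual planner; the weight and overhead figures are placeholders).

```python
# Back-of-envelope sketch (assumptions, not hwLedger's planner): how many
# context tokens fit once weights and overhead are subtracted from VRAM?

def max_context_tokens(vram_bytes, weight_bytes, overhead_bytes,
                       n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    free = vram_bytes - weight_bytes - overhead_bytes
    # fp16 GQA KV-cache cost per token across all layers (K and V).
    per_token = n_layers * 2 * n_kv_heads * head_dim * bytes_per_elem
    return max(free // per_token, 0)

# 24 GiB GPU, 8B model at 4-bit (~4.0 GB weights), 2 GiB assumed overhead,
# Llama-3-8B-like KV shape (32 layers, 8 KV heads, head_dim 128):
tokens = max_context_tokens(24 * 2**30, 4_000_000_000, 2 * 2**30,
                            n_layers=32, n_kv_heads=8, head_dim=128)
print(tokens)
```

Under these assumptions roughly 150k tokens of fp16 KV cache fit, which is the kind of answer an individual developer needs before picking a quantization level or an inference engine.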

Section 06

Open-Source Significance of hwLedger

hwLedger contributes to the LLM community as:

  1. Accurate Capacity Tool: Fills gaps in MoE/MLA support for existing calculators.
  2. Cross-Platform Reference: Rust core + native UI pattern for multi-platform tools.
  3. Cluster Management Guide: Event sourcing, cost models, and scheduling logic for distributed LLM deployment.
  4. Apple Silicon Optimization: Specialized support for M-series chips.

Section 07

Development Roadmap of hwLedger

hwLedger follows a phased plan:

Phase | Content                          | Status
------|----------------------------------|------------
P0    | Basic infrastructure             | In progress
P1    | Math core (capacity calculation) | Planned
P2    | Config parsing + telemetry       | Planned
P3    | macOS GUI MVP                    | Planned
P4    | Inference (macOS)                | Planned
P5    | Cluster management               | Planned
P6    | Windows GUI                      | Delayed
P7    | Linux GUI                        | Delayed

Current focus: WP21 (macOS release) including code signing, GitHub Actions workflow, DMG packaging, and Sparkle auto-updates.

Section 08

Conclusion: hwLedger's Potential in LLM Infrastructure

hwLedger addresses critical pain points in LLM deployment with its architecture-aware capacity planning, heterogeneous cluster management, and local inference capabilities. Its open-source nature and technical depth make it a valuable tool for developers and teams. As development progresses, it is poised to become an important reference in the LLM infrastructure space.