# SystemVerilog Implementation of Neural Network Inference: Fixed-Point Quantization and Hardware Deployment Practice

> This project demonstrates how to convert a Python-trained neural network into a SystemVerilog hardware implementation using the Q3.12 fixed-point quantization format, achieving a test accuracy of 92.98% on the breast cancer classification task and providing a reproducible reference implementation for AI chip design.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-08T08:42:03.000Z
- Last activity: 2026-06-08T08:54:16.069Z
- Popularity: 141.8
- Keywords: SystemVerilog, neural network hardware implementation, fixed-point quantization, FPGA, AI chips, edge computing, digital circuit design, model deployment
- Page link: https://www.zingnex.cn/en/forum/thread/systemverilog
- Canonical: https://www.zingnex.cn/forum/thread/systemverilog

---

## Project Introduction: Fixed-Point Practice for Neural Network Inference in SystemVerilog

This project was developed by Kiana Jafari, with source code hosted on GitHub ([link](https://github.com/Kiana-Jafari/SystemVerilog-ANN)), and released on June 8, 2026. It converts a Python-trained neural network into a SystemVerilog hardware implementation using the Q3.12 fixed-point quantization format. The design reaches a test accuracy of 92.98% on the breast cancer classification task and serves as a reproducible reference implementation for AI chip design.

## Background: Core Challenges in Neural Network Hardware Implementation

The popularity of deep learning has driven demand for deployment on dedicated hardware (FPGA/ASIC), whose low power consumption and high throughput suit edge computing scenarios. Implementing floating-point (FP32) arithmetic in hardware consumes significant resources, however, which makes quantization a key technique: floating-point weights and activations are converted to fixed-point numbers to balance accuracy against hardware complexity. This project demonstrates an end-to-end workflow from Python training to SystemVerilog implementation.

## Project Architecture and Network Design

The project is organized into three parts:
1. **Data Directory**: Stores the Wisconsin Breast Cancer Dataset (569 samples, 30 features) and preprocessing scripts;
2. **Python Directory**: Trains a minimal 2-4-2 network: 2 input neurons → 4 hidden neurons with ReLU activation → 2 output neurons (Softmax during training, simplified to Argmax for inference);
3. **SystemVerilog Directory**: Core hardware implementation code.

The 30 input features are reduced to 2 dimensions (e.g., via PCA) before they reach the network.
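
The 2-4-2 topology and the Softmax-to-Argmax simplification can be illustrated with a small floating-point reference model. This is a minimal sketch, not the project's code: the weight and bias values are made-up placeholders, the presence of bias terms is an assumption, and the helper names (`dense`, `predict`) are hypothetical. It shows why Softmax can be dropped at inference time: Softmax preserves the ordering of the logits, so the Argmax result is the same with or without it.

```python
import math

# Placeholder parameters (NOT the trained model). Bias terms are an assumption.
W1 = [[0.5, -0.25], [-0.75, 0.5], [0.25, 0.25], [-0.5, -0.5]]  # 4 hidden x 2 inputs
B1 = [0.0, 0.1, -0.1, 0.0]
W2 = [[0.5, -0.5, 0.25, 0.75], [-0.5, 0.5, -0.25, -0.75]]      # 2 outputs x 4 hidden
B2 = [0.0, 0.0]

def dense(x, weights, biases):
    """Fully connected layer: weights[j][i] links input i to neuron j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])

def logits(x):
    return dense(relu(dense(x, W1, B1)), W2, B2)

def predict(x):
    # Inference path: Argmax applied directly to the logits, no Softmax.
    return argmax(logits(x))
```

Because the exponential is monotonic, `argmax(softmax(z))` and `argmax(z)` always pick the same class, which removes the need for an exponential unit in hardware.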

## Quantization Strategy: Analysis of Q3.12 Fixed-Point Format

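The project uses Q3.12 fixed-point numbers stored in 16-bit words, with 12 bits for the fractional part (resolution 2^-12 ≈ 0.00024). The integer part is described as 3 bits including the sign bit, which gives a range of -4 to about 3.9998. Note that 3 + 12 accounts for only 15 bits: in a 16-bit word the remaining bit is either a sign extension or, under the alternative Q-format convention that counts the sign bit separately, an extra integer bit that widens the range to -8 to about 7.9998. The description does not say which applies. A post-training quantization strategy is adopted: the model is first trained in floating-point, and the weights are then converted to fixed-point numbers to balance accuracy and resource overhead.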
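
The float-to-fixed conversion described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the project's quantization script: it assumes 12 fractional bits, the stated range of -4 to just under 4, round-to-nearest, and saturation on overflow. The function names are hypothetical.

```python
FRAC_BITS = 12
SCALE = 1 << FRAC_BITS   # 4096 steps per unit
Q_MIN = -4 * SCALE       # -16384 represents -4.0 (assumed lower bound)
Q_MAX = 4 * SCALE - 1    #  16383 represents 4 - 2**-12 (assumed upper bound)

def to_q3_12(x):
    """Quantize a float to a Q3.12 integer code, saturating out-of-range values.

    Python's round() resolves exact ties to the nearest even integer.
    """
    q = int(round(x * SCALE))
    return max(Q_MIN, min(Q_MAX, q))

def from_q3_12(q):
    """Convert a Q3.12 integer code back to a float."""
    return q / SCALE

if __name__ == "__main__":
    for value in (1.0, 0.3, -4.0, 10.0):
        code = to_q3_12(value)
        print(value, code, from_q3_12(code))
```

For in-range inputs the round-trip error is at most half a step, 2^-13 ≈ 0.00012. Values outside the range clamp to the nearest representable limit instead of wrapping around.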

## Hardware Implementation and Development Workflow

### Hardware Modules
- **Matrix Multiplication Unit**: Implements operations from input to hidden layer (2×4) and hidden layer to output layer (4×2);
- **Activation Function Module**: Lightweight implementation of ReLU (max(0,x)) and Argmax;
- **Data Path**: Handles fixed-point overflow issues;
- **Storage Architecture**: Weights stored in on-chip RAM/ROM or registers.
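
The arithmetic these modules perform can be modelled with plain integers, which is a common way to build a bit-accurate reference for a fixed-point data path. The sketch below is an assumption-laden illustration, not the project's RTL: it assumes a wide accumulator, a bias aligned to the product scale, truncation by arithmetic right shift, and saturation to the -4 to just under 4 range. The actual overflow handling in the SystemVerilog code may differ.

```python
FRAC_BITS = 12
Q_MIN, Q_MAX = -(1 << 14), (1 << 14) - 1  # assumed Q3.12 code range

def saturate(v):
    """Clamp an integer to the assumed Q3.12 code range."""
    return max(Q_MIN, min(Q_MAX, v))

def fixed_dense(x_q, w_q, b_q):
    """Fixed-point fully connected layer on Q3.12 integer codes.

    Each product of two Q3.12 values carries 24 fractional bits, so the
    bias is shifted left by 12 to match before accumulation. The sum is
    then shifted right by 12 (arithmetic shift, rounds toward negative
    infinity) and saturated back to Q3.12.
    """
    out = []
    for row, b in zip(w_q, b_q):
        acc = b << FRAC_BITS
        for w, x in zip(row, x_q):
            acc += w * x
        out.append(saturate(acc >> FRAC_BITS))
    return out

def fixed_relu(v):
    """ReLU on integer codes: negative values become zero."""
    return [x if x > 0 else 0 for x in v]

def argmax(v):
    """Index of the largest element (first one wins on ties)."""
    return max(range(len(v)), key=lambda i: v[i])
```

As a quick check, inputs 1.0 and 0.5 (codes 4096 and 2048) with weights 0.5 and -0.25 (codes 2048 and -1024) give the code 1536, which is 0.375 in Q3.12. In hardware, ReLU reduces to testing the sign bit, and Argmax over two outputs reduces to a single signed comparison.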

### Development Workflow
1. Floating-point model training;
2. Quantization calibration (determine scaling factors/zero points);
3. Quantized model validation;
4. SystemVerilog implementation;
5. Simulation validation (compare with Python results);
6. Synthesis and deployment.
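
The hand-off between the Python and SystemVerilog sides (steps 3 to 5) usually needs two small utilities: one to write quantized weights in a form a simulator can load, and one to compare simulated predictions against the Python reference. The sketch below assumes, hypothetically, that weights are loaded with the standard `$readmemh` system task from a text file holding one 16-bit hexadecimal word per line. The source does not state how the project transfers weights, and all function names here are illustrative.

```python
def to_hex16(q):
    """Encode a signed integer code as a 4-digit two's-complement hex word."""
    return format(q & 0xFFFF, "04x")

def from_hex16(word):
    """Decode a 4-digit two's-complement hex word back to a signed integer."""
    value = int(word, 16)
    return value - 0x10000 if value & 0x8000 else value

def memh_text(codes):
    """Render integer codes as memory-file text, one hex word per line."""
    return "\n".join(to_hex16(q) for q in codes) + "\n"

def mismatches(expected, actual):
    """Indices where the simulated predictions differ from the reference."""
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]

if __name__ == "__main__":
    print(memh_text([4096, -1024, 0]), end="")
    print(mismatches([0, 1, 1, 0], [0, 1, 0, 0]))
```

An empty mismatch list means the hardware simulation reproduces the Python predictions for every test sample. Any listed indices point to the samples worth inspecting in the waveform.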

## Application Value and Improvement Directions

### Value
- Reproducible reference for edge AI developers;
- End-to-end case for digital chip design learners;
- Modular starting point for AI chip design.

### Limitations
- Small network scale (only 16 weights);
- Simple quantization strategy (post-training quantization);
- Lack of complete verification environment description.

### Improvement Directions
- Extend to LeNet/small ResNet;
- Support convolutional layers;
- Adopt Quantization-Aware Training (QAT);
- Provide FPGA deployment tutorials and performance benchmarks.