Zing Forum

Deep Learning-Driven Surrogate Model for Nuclear Reactors: Application of Spatiotemporal Neural Networks in 3D Core Simulation

A deep learning surrogate model based on a hybrid spatiotemporal neural network architecture (ViT3D, Mamba) for simulating the flexible operation of 3D nuclear reactor cores, enabling efficient and accurate physical field prediction.

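Tags: deep learning, surrogate model, nuclear reactor, spatiotemporal neural network, ViT3D, Mamba, core simulation, machine learning, physics-informed AI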
Published 2026-05-16 21:55 | Recent activity 2026-05-16 22:00 | Estimated read 8 min
Section 01

[Introduction] Deep Learning-Driven Surrogate Model for Nuclear Reactors: Application of Spatiotemporal Neural Networks in 3D Core Simulation

Traditional 3D physical field simulation of nuclear reactor cores relies on Monte Carlo or CFD methods, which are computationally expensive: a single run takes hours to days, which limits their use in scenarios such as design optimization and real-time monitoring. This project builds a surrogate model on a hybrid spatiotemporal neural network architecture (ViT3D + Mamba). The surrogate reduces inference time to seconds or milliseconds while maintaining accuracy, and supports design, operation, and safety analysis in the nuclear energy field.

Section 02

Project Background and Core Challenges

Nuclear reactor physical field simulation is a demanding computational science problem. Traditional methods are accurate but extremely costly: a single 3D core analysis takes hours or even days, which severely restricts design optimization, real-time operation monitoring, and flexible strategy development. Surrogate models address this cost by using deep learning to learn the input-output mapping of the traditional simulator, significantly reducing inference time at acceptable accuracy.

Section 03

Technical Architecture: Hybrid Spatiotemporal Neural Network Design

This project adopts an innovative hybrid architecture combining the advantages of ViT3D and Mamba:

  1. Spatial Encoding (ViT3D): Treats the 3D core as a volumetric image and captures spatial correlations between fuel assemblies via patch embedding and multi-head self-attention. This suits local physical phenomena such as power distortion caused by control rods;
  2. Temporal Modeling (Mamba): Processes long-sequence dependencies with linear complexity and models dynamic processes such as power changes and control rod movements, balancing long-range memory against computational overhead;
  3. Residual Learning and Physical Constraints: Predicts incremental state changes (delta prediction) rather than absolute states, which matches physical intuition and improves training stability.
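
The three ideas above can be illustrated with a minimal pure-Python sketch. It is not the project's code: `patchify_3d`, `linear_scan`, and `apply_delta` are hypothetical names, the scan is a fixed-coefficient linear recurrence rather than Mamba's input-dependent selective scan, and the patch size and coefficients are arbitrary.

```python
from itertools import product


def patchify_3d(volume, patch):
    """Split a volume (nested lists indexed [z][y][x]) into flat, non-overlapping cubic patches."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    assert nz % patch == 0 and ny % patch == 0 and nx % patch == 0
    patches = []
    for z0, y0, x0 in product(range(0, nz, patch), range(0, ny, patch), range(0, nx, patch)):
        patches.append([
            volume[z0 + dz][y0 + dy][x0 + dx]
            for dz, dy, dx in product(range(patch), repeat=3)
        ])
    return patches


def linear_scan(inputs, decay=0.9, gain=0.1):
    """Linear recurrence h_t = decay * h_{t-1} + gain * x_t, computed in one pass (O(T))."""
    state, states = 0.0, []
    for x in inputs:
        state = decay * state + gain * x
        states.append(state)
    return states


def apply_delta(current_field, predicted_delta):
    """Residual update: next state = current state + predicted increment."""
    return [c + d for c, d in zip(current_field, predicted_delta)]


# A 4x4x4 volume whose voxel value encodes its flat index.
volume = [[[z * 16 + y * 4 + x for x in range(4)] for y in range(4)] for z in range(4)]
patches = patchify_3d(volume, patch=2)
print(len(patches), len(patches[0]))  # number of patches, voxels per patch

states = linear_scan([1.0, 1.0, 1.0])
next_field = apply_delta([1.0, 2.0], [0.5, -0.5])
print(states, next_field)
```

In a ViT-style encoder each flat patch would then be projected to a token embedding before self-attention; that projection is omitted here.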

Section 04

Input-Output Design: Multiphysics Coupling Features

Input Dimensions: The inputs cover multiphysics coupling features:

  • Geometric and material information (3D fuel layout, enrichment, etc.);
  • Control rod status (position map and target position);
  • Thermal-hydraulic parameters (coolant inlet temperature, flow rate, and pressure);
  • Power level (current and target setpoints).

Output Physical Quantities: The outputs are core safety analysis indicators: 3D power distribution, effective multiplication factor (keff), fuel temperature, and coolant density field.
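
One way to picture this interface is as a pair of typed containers. The sketch below is an assumption-laden illustration: the class names `CoreInputs` and `CoreOutputs`, every field name, and the choice to store the rod target as a 3D map are hypothetical, and units are left unspecified because the source does not state them.

```python
from dataclasses import dataclass
from typing import List

Grid3D = List[List[List[float]]]  # nested lists indexed [z][y][x]


def grid_shape(grid):
    """Return (nz, ny, nx) of a rectangular nested-list grid."""
    return (len(grid), len(grid[0]), len(grid[0][0]))


def filled(shape, value):
    """Build a grid of the given shape with every node set to `value`."""
    nz, ny, nx = shape
    return [[[value] * nx for _ in range(ny)] for _ in range(nz)]


@dataclass
class CoreInputs:
    enrichment: Grid3D         # geometric/material information per node
    rod_position: Grid3D       # current control rod position map
    rod_target: Grid3D         # target rod position (stored as a map: an assumption)
    inlet_temperature: float   # coolant inlet temperature
    inlet_flow_rate: float     # coolant inlet flow rate
    inlet_pressure: float      # coolant inlet pressure
    power_level: float         # current power level
    power_setpoint: float      # target power setpoint

    def __post_init__(self):
        # All 3D maps must describe the same core grid.
        shapes = {grid_shape(g) for g in (self.enrichment, self.rod_position, self.rod_target)}
        if len(shapes) != 1:
            raise ValueError(f"3D input maps disagree on shape: {sorted(shapes)}")


@dataclass
class CoreOutputs:
    power: Grid3D              # 3D power distribution
    keff: float                # effective multiplication factor
    fuel_temperature: Grid3D
    coolant_density: Grid3D


shape = (2, 3, 3)
inputs = CoreInputs(
    enrichment=filled(shape, 3.1),
    rod_position=filled(shape, 0.0),
    rod_target=filled(shape, 0.0),
    inlet_temperature=1.0,
    inlet_flow_rate=1.0,
    inlet_pressure=1.0,
    power_level=1.0,
    power_setpoint=0.8,
)
print(grid_shape(inputs.enrichment))
```

Validating shapes at construction time catches mismatched maps before they reach the model.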

Section 05

Three-Stage Progressive Training Strategy

Training proceeds in three phases so that the model learns the underlying physical laws progressively:

  1. Spatial Encoder Pre-training: Independently train Encoder3D, using steady-state condition data for supervised learning to establish input-to-internal feature mapping;
  2. Spatiotemporal Processor Joint Training: Jointly train the pre-trained encoder with STProcessor and Decoder3D, introduce temporal data to learn dynamic evolution, covering normal operation to transient accident conditions;
  3. Single-Step Inference Optimization: Fine-tune to adapt to real-time inference scenarios, support predict_step mode, and introduce boundary caching to improve continuous inference efficiency.
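
The staging above can be written down as a small schedule that records which modules are updated in each phase. This is a sketch, not the project's training script: the `Stage` class, the stage and dataset labels, and in particular the set of modules fine-tuned in the third stage are assumptions, since the source names the modules (Encoder3D, STProcessor, Decoder3D) but not these details.

```python
from dataclasses import dataclass
from typing import Tuple

MODULES = ("Encoder3D", "STProcessor", "Decoder3D")


@dataclass(frozen=True)
class Stage:
    name: str                    # hypothetical stage label
    data: str                    # hypothetical dataset label
    trainable: Tuple[str, ...]   # modules whose weights are updated


SCHEDULE = (
    # Stage 1: the spatial encoder is trained on its own with steady-state data.
    Stage("encoder_pretraining", "steady_state", ("Encoder3D",)),
    # Stage 2: the pre-trained encoder is trained jointly with the other modules.
    Stage("joint_training", "transient_sequences", MODULES),
    # Stage 3: ASSUMPTION - the source does not say which modules are fine-tuned.
    Stage("single_step_finetuning", "single_step_pairs", ("STProcessor", "Decoder3D")),
)


def trainable_flags(stage):
    """Map every module name to True (updated in this stage) or False (frozen)."""
    return {module: module in stage.trainable for module in MODULES}


for stage in SCHEDULE:
    print(stage.name, stage.data, trainable_flags(stage))
```

In a Keras model, flags like these would typically be applied by setting each layer's `trainable` attribute before compiling for the stage.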

Section 06

Technical Implementation Details and Engineering Practices

The project follows several engineering practices:

  • Configuration Management: Hyperparameters and geometric configurations are managed in one place via model_config;
  • Symmetry Handling: The halo_expand function handles core geometric symmetry (quarter/eighth core) to reduce data requirements;
  • Multi-Task Output: Parallel branches predict 3D physical fields and global scalars (keff) simultaneously;
  • Deployment Support: An ONNX export interface is reserved for deployment on high-performance inference engines;
  • Dependency Management: Compatibility with TensorFlow 2.14 and Python 3.9-3.11 is documented, along with CUDA/cuDNN configuration guidelines.
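
To see why symmetry reduces data requirements, consider rebuilding a full core map from one stored quadrant. The source does not show how halo_expand works, so the function below, `expand_quarter_core`, is a hypothetical stand-in. It assumes a 2D assembly map, mirror symmetry about both axes, and symmetry axes that run between assemblies (an even number of assemblies per side).

```python
def expand_quarter_core(quarter):
    """Mirror a south-east quadrant into a full core map.

    `quarter[0][0]` is the assembly closest to the core centre. Rows are
    mirrored left-right to build each full row, and the row order is
    mirrored top-bottom to build the northern half.
    """
    full = []
    for row in reversed(quarter):          # northern half (mirrored row order)
        full.append(list(reversed(row)) + list(row))
    for row in quarter:                    # southern half (original row order)
        full.append(list(reversed(row)) + list(row))
    return full


quarter = [
    [1, 2],
    [3, 4],
]
full = expand_quarter_core(quarter)
for row in full:
    print(row)
```

Under these assumptions only a quarter of the assemblies need to be stored or simulated; an eighth-core variant would additionally exploit diagonal symmetry.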

Section 07

Application Prospects and Industry Significance

The technique has broad application value:

  • Design Optimization: Quickly evaluate hundreds of fuel loading schemes, shorten design cycles;
  • Operation Support: Real-time physical field prediction assists operators in optimizing control strategies;
  • Safety Analysis: Rapidly generate large numbers of condition samples for Probabilistic Safety Assessment (PSA);
  • Digital Twin: Act as a core component connecting real-time monitoring and physical simulation, enabling predictive maintenance and anomaly detection.

Section 08

Technical Limitations and Future Development Directions

Limitations: Accuracy is limited by the coverage of the training data, so extrapolation to extreme conditions is unreliable. The black-box nature of neural networks also makes the model hard to interpret, which needs attention in nuclear safety applications.

Future Directions:

  1. Physics-Informed Neural Networks (PINN): Integrate differential equation constraints to improve physical consistency;
  2. Uncertainty Quantification: Use Bayesian neural networks or ensemble methods to provide confidence intervals;
  3. Multi-Fidelity Fusion: Combine high-fidelity CFD with low-resolution surrogate models to achieve adaptive accuracy control.
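
As an illustration of the ensemble route to uncertainty quantification, the sketch below turns the predictions of several independently trained models into a mean and an approximate interval. The function name `ensemble_interval` and the keff values are made up for the example, and the interval assumes the member predictions are roughly normally distributed; it is not calibrated against any reference solution.

```python
import statistics


def ensemble_interval(predictions, z=1.96):
    """Return (mean, lower, upper) from the spread of ensemble member predictions.

    Uses mean +/- z * sample standard deviation, which corresponds to an
    approximate 95% interval for z = 1.96 under a normality assumption.
    """
    mean = statistics.mean(predictions)
    spread = statistics.stdev(predictions)  # sample standard deviation across members
    return mean, mean - z * spread, mean + z * spread


# Made-up keff predictions from five hypothetical ensemble members.
keff_members = [1.0010, 1.0020, 0.9990, 1.0000, 1.0005]
mean, lower, upper = ensemble_interval(keff_members)
print(f"keff = {mean:.4f}, interval [{lower:.4f}, {upper:.4f}]")
```

A wide interval signals that the members disagree, which is one practical indicator that an input lies outside the training data coverage.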