SASNet: Spatial Adaptive Sinusoidal Neural Network for Implicit Neural Representations

SASNet is a spatially adaptive sinusoidal neural network that improves implicit neural representations by adjusting the frequency of its sinusoidal activations according to spatial position. It is reported to perform well on tasks such as image reconstruction and 3D scene representation.

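Tags: implicit neural representation, INR, sinusoidal neural network, SIREN, image reconstruction, 3D representation, neural radiance fields, computer vision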
Published 2026-05-18 05:41 | Recent activity 2026-05-18 05:50 | Estimated read 7 min
Section 01

[Introduction] SASNet: Innovative Application of Spatial Adaptive Sinusoidal Neural Network in Implicit Neural Representations

SASNet is a spatially adaptive sinusoidal neural network for Implicit Neural Representations (INR). Traditional methods use a fixed activation frequency; SASNet instead adjusts the frequency of its sinusoidal activations by spatial position. On tasks such as image reconstruction and 3D scene representation, it is reported to improve representation efficiency, detail preservation, convergence speed, and cross-scale consistency.

Section 02

Research Background: Potential of INR and Limitations of Traditional Methods

Implicit Neural Representation (INR) uses a continuous neural network function to represent signals such as images and 3D shapes. Its advantages include resolution independence, memory efficiency, and differentiability, and it shows strong potential in tasks such as super-resolution and 3D reconstruction. However, traditional INR methods use either fixed activation functions (e.g., ReLU) or sinusoidal activations with a fixed frequency (e.g., SIREN), and both struggle to adapt to signal characteristics that vary across spatial positions. Detail-rich regions need high frequencies to capture fine structure, while smooth regions need low frequencies, and a single fixed frequency cannot serve both.
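
For reference, a fixed-frequency sine layer in the SIREN style can be sketched roughly as follows. This is a minimal illustration rather than any published implementation: the function name `fixed_sine_layer`, the layer sizes, and the random weights are assumptions made for the example, and `omega_0 = 30` is the value commonly used with SIREN. The point it shows is that one constant scales the pre-activation at every input position.

```python
import numpy as np

def fixed_sine_layer(coords, weight, bias, omega_0=30.0):
    """SIREN-style layer: sin(omega_0 * (coords @ weight + bias)).

    omega_0 is a single constant shared by every input position.
    """
    return np.sin(omega_0 * (coords @ weight + bias))

rng = np.random.default_rng(0)
coords = np.linspace(-1.0, 1.0, 5).reshape(-1, 1)  # five 1D coordinates in [-1, 1]
weight = rng.uniform(-1.0, 1.0, size=(1, 8))       # hypothetical 1 -> 8 layer
bias = np.zeros(8)

features = fixed_sine_layer(coords, weight, bias)
print(features.shape)  # (5, 8)
```

Because `omega_0` does not depend on `coords`, a value high enough for textured regions is also applied in flat regions, which is the limitation described above.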

Section 03

Core Method: Spatial Adaptive Frequency Modulation Mechanism

The core of SASNet is spatial adaptive frequency modulation:

  1. Frequency parameters become functions of spatial position: standard SIREN applies one fixed frequency, sin(ω·x), whereas SASNet uses sin(ω(x)·x), where ω(x) is predicted by a small sub-network. This gives each position its own frequency: higher in detail-rich regions, lower in smooth regions.
  2. The architecture has two parts: the main INR network, which maps spatial coordinates to signal values, and the frequency prediction network, which takes coordinates as input and outputs frequency parameters. The two are trained jointly to improve fitting capability.
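
The two-part design above might look something like the following forward pass. This is a sketch under several assumptions, not the SASNet code: the helper names (`init_linear`, `frequency_net`, `adaptive_sine_layer`), the layer widths, the softplus used to keep ω(x) positive, and the `base_omega` scale are all hypothetical. The sine is also applied to a linear pre-activation, sin(ω(x)·(xW + b)), as in SIREN, rather than to the raw coordinate.

```python
import numpy as np

def init_linear(rng, n_in, n_out):
    """Uniformly initialised weight matrix and zero bias (illustrative only)."""
    bound = 1.0 / np.sqrt(n_in)
    return rng.uniform(-bound, bound, size=(n_in, n_out)), np.zeros(n_out)

def softplus(z):
    """Numerically stable log(1 + exp(z)); always positive."""
    return np.logaddexp(0.0, z)

def frequency_net(coords, params, base_omega=30.0):
    """Small sub-network: coordinates -> one positive frequency per position."""
    (w1, b1), (w2, b2) = params
    hidden = np.tanh(coords @ w1 + b1)
    return base_omega * softplus(hidden @ w2 + b2)  # shape (N, 1)

def adaptive_sine_layer(coords, weight, bias, omega):
    """sin(omega(x) * (x W + b)): the frequency varies with position."""
    return np.sin(omega * (coords @ weight + bias))

rng = np.random.default_rng(0)
coords = rng.uniform(-1.0, 1.0, size=(4, 2))  # four 2D positions
freq_params = [init_linear(rng, 2, 16), init_linear(rng, 16, 1)]
weight, bias = init_linear(rng, 2, 32)

omega = frequency_net(coords, freq_params)     # (4, 1), one value per position
features = adaptive_sine_layer(coords, weight, bias, omega)
print(omega.shape, features.shape)  # (4, 1) (4, 32)
```

Only the forward pass is shown. Joint training of both networks would need gradients through `omega`, which in practice means an automatic differentiation framework.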

Section 04

Technical Advantages: Efficiency, Detail, Convergence, and Cross-scale Consistency

  1. Higher representation efficiency: Achieves comparable or better quality with fewer parameters; in image reconstruction, it is reported to reach lower error with a smaller network.
  2. Better detail preservation: Avoids the fixed-frequency trade-off, in which a high frequency introduces noise in smooth regions and a low frequency loses fine detail. Smooth regions stay clean while fine details are preserved.
  3. Faster convergence: Adaptive frequency gives the optimizer more flexibility, so fewer iterations are needed.
  4. Cross-scale consistency: Maintains an appropriate level of detail when queried at any resolution, avoiding over-smoothing or artifacts.
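
Querying at any resolution, as in the last item, amounts to evaluating the same function on a denser or coarser coordinate grid. The sketch below illustrates the idea; `make_grid`, `render`, and the `toy_inr` stand-in (a smooth analytic function used in place of a trained network) are hypothetical names chosen for the example.

```python
import numpy as np

def make_grid(height, width):
    """Pixel-centre coordinates in [-1, 1]^2, shape (height * width, 2)."""
    ys = (np.arange(height) + 0.5) / height * 2.0 - 1.0
    xs = (np.arange(width) + 0.5) / width * 2.0 - 1.0
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_x.ravel(), grid_y.ravel()], axis=-1)

def render(inr, height, width):
    """Evaluate a coordinate function on a grid and reshape to an image."""
    return inr(make_grid(height, width)).reshape(height, width)

def toy_inr(coords):
    """Stand-in for a trained network: any function of (x, y) works here."""
    return np.sin(3.0 * coords[:, 0]) * np.cos(2.0 * coords[:, 1])

low = render(toy_inr, 32, 32)     # coarse query
high = render(toy_inr, 128, 128)  # same function, 4x denser grid
print(low.shape, high.shape)  # (32, 32) (128, 128)
```

No retraining or interpolation step is involved; only the grid changes. Whether the output stays clean at the denser grid depends on how well the network's frequencies match the signal, which is where the adaptive frequency is claimed to help.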

Section 05

Application Scenarios: Image, 3D, Video, and SDF Learning

  1. Image representation and compression: An image is encoded into compact network weights, with better visual quality at extremely low bit rates.
  2. 3D scene representation and novel view synthesis: Suitable for tasks like NeRF, handling regions with both abrupt and gradual geometric changes.
  3. Video representation: A variant extended with a time dimension can modulate frequency adaptively across both space and time for efficient compression.
  4. SDF learning: Accurately captures surface details of 3D objects while maintaining smoothness in regions far from the surface.
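
For the compression use case, the bit rate of an image stored as network weights follows from simple arithmetic: total weight bits divided by pixel count. The example below is a worked illustration with made-up numbers. The parameter count and the 16-bit precision are assumptions, not figures from SASNet, and the count should include the frequency prediction network as well as the main network. The 768×512 size matches Kodak images.

```python
def bits_per_pixel(num_params, bits_per_param, height, width):
    """Bit rate of an image stored as network weights."""
    return num_params * bits_per_param / (height * width)

# Hypothetical budget: 10,000 parameters stored at 16 bits each,
# representing one 768x512 image.
bpp = bits_per_pixel(num_params=10_000, bits_per_param=16, height=512, width=768)
print(round(bpp, 3))  # 0.407
```

For comparison, an uncompressed 8-bit RGB image uses 24 bits per pixel.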

Section 06

Experimental Results: Excellent Performance on Multi-task Benchmark Datasets

SASNet is reported to outperform baseline methods on several benchmark datasets:

  • Image reconstruction: On the Kodak and DIV2K datasets, its reconstruction error is lower than that of methods such as SIREN and Fourier features at the same network capacity.
  • 3D representation: More accurately recovers geometric details of object surfaces on the ShapeNet dataset.
  • Training efficiency: Requires fewer iterations than baseline methods to reach target accuracy.
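
The source does not say which error metric was used. Mean squared error and PSNR are common choices for image reconstruction, so the sketch below assumes those; the function names and the toy arrays are illustrative only.

```python
import numpy as np

def mse(reference, reconstruction):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((reference - reconstruction) ** 2))

def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB for signals in [0, max_value]."""
    err = mse(reference, reconstruction)
    if err == 0.0:
        return float("inf")  # identical signals
    return float(10.0 * np.log10(max_value ** 2 / err))

reference = np.full((4, 4), 0.5)
reconstruction = reference + 0.1  # uniform error of 0.1, so MSE is about 0.01
print(psnr(reference, reconstruction))  # approximately 20 dB
```

Comparing methods "at the same network capacity" then means computing this value for each model while holding the parameter count fixed.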

Section 07

Implementation and Outlook: Open-source Code and Future Research Directions

Implementation: The open-source code provides complete architecture definitions, task examples, pre-trained models, training scripts, and documentation, with clear interfaces for easy integration.

Outlook:

  • Extend to complex architectures like Transformer-based INR;
  • Explore applications in other modalities such as audio and sensor data;
  • Combine with NAS to automatically discover optimal frequency modulation strategies;
  • Develop more efficient frequency prediction networks to reduce computational overhead.

SASNet provides new ideas for the development of INR and is expected to have a wide impact in fields like image processing and 3D vision.