Reptimeline: Tracking Representation Evolution During Neural Network Training

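An open-source tool for monitoring the lifecycle of discrete representations in neural networks. It supports multiple backends, from sparse autoencoders and VQ-VAE to FSQ, automatically discovers key events such as concept birth, concept death, and relation formation, and provides causal verification and ontology discovery.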

Tags: neural network, representation learning, interpretability, sparse autoencoder, VQ-VAE, causal verification, ontology discovery, machine learning, AI explainability
Published 2026/05/04 11:44 | Last activity 2026/05/04 11:49 | Estimated reading time: 5 minutes

Section 01

Reptimeline: An Open-Source Tool for Tracking Neural Network Representation Evolution

Reptimeline is an open-source tool designed to monitor the lifecycle of discrete representations in neural networks. It supports multiple backends (sparse autoencoders, VQ-VAE, FSQ), automatically detects key events like concept birth, death, connection formation, and phase transitions, and provides causal verification and ontology discovery functions. It addresses the gap in dynamic analysis of neural representations during training.

Section 02

Project Background and Research Motivation

In deep learning, understanding what a neural network learns and how its representations evolve during training is a core interpretability challenge. Traditional methods typically analyze a model statically after training, so they miss dynamic processes such as concepts becoming clearer and associations forming between them. Reptimeline is described as the third part of a larger research plan, alongside papers on prime-factorization neuro-symbolic AI and quaternion logic, and aims to provide a framework for tracking discrete representation evolution across the full training lifecycle.

Section 03

Core Functionalities of Reptimeline

  1. Lifecycle Tracking: Identifies events like birth (first distinguishable concept), death (representation collapse), connection formation (concept associations), and phase transitions (training strategy shifts).
  2. Phase Transition Detection: Discovers key turning points in training via discontinuities in tracked metrics.
  3. Bottom-up Ontology Discovery: Finds concept structures (duals, dependencies, 3-way interactions, hierarchy) without predefined primitives.
  4. Auto-labeling: Supports embedding-based, contrastive, and LLM-assisted annotation.
  5. Causal Verification: Offers intervention tests, Bootstrap CIs, permutation tests, and BH-FDR correction for multiple comparisons.
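
The lifecycle tracking described in item 1 can be illustrated with a small sketch. This is not Reptimeline's implementation: the function name `detect_lifecycle_events`, the `active_threshold` parameter, and the rule "a bit is alive when its activation rate is neither near 0 nor near 1" are all assumptions made for illustration. The input is assumed to be one binary code matrix per training checkpoint.

```python
import numpy as np


def detect_lifecycle_events(snapshots, steps, active_threshold=0.01):
    """Report birth/death events for each bit across training checkpoints.

    snapshots: list of (n_samples, n_bits) binary arrays, one per checkpoint.
    steps:     training step of each checkpoint, same length as snapshots.

    Assumption (hypothetical rule): a bit is "alive" when its activation
    rate lies strictly between active_threshold and 1 - active_threshold,
    i.e. it still distinguishes some inputs from others.
    """
    events = []
    prev_alive = None
    for step, codes in zip(steps, snapshots):
        rate = np.asarray(codes, dtype=float).mean(axis=0)
        alive = (rate > active_threshold) & (rate < 1.0 - active_threshold)
        if prev_alive is not None:
            # Birth: the bit was not informative before and is now.
            for bit in np.flatnonzero(alive & ~prev_alive):
                events.append((step, "birth", int(bit)))
            # Death: the bit was informative and has collapsed to a constant.
            for bit in np.flatnonzero(~alive & prev_alive):
                events.append((step, "death", int(bit)))
        prev_alive = alive
    return events
```

Calling it on three checkpoints where one bit starts varying and another collapses to a constant yields one birth and one death at the checkpoint where the change first appears. The first checkpoint only sets the baseline, so bits already alive there are not reported as births.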

Section 04

Technical Architecture and Backend Support

Reptimeline is backend-agnostic, supporting multiple discretization schemes via a unified extractor interface.

  • Built-in Extractors: SAEExtractor (sparse autoencoder, Top-K binarization), VQVAEExtractor (VQ-VAE, codebook index converted to a binary code), FSQExtractor (finite scalar quantization, non-zero or one-hot binarization).
  • Custom Extractors: Implement the RepresentationExtractor interface (the project provides example code for extracting snapshots, defining similarity, and identifying shared features).
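
To make the extractor idea concrete, here is a minimal sketch of what a unified binary-code interface could look like. The class and function names below (`BinaryExtractor`, `TopKExtractor`, `CodebookExtractor`, `hamming_similarity`, `shared_features`) are hypothetical and are not Reptimeline's actual `RepresentationExtractor` API; only the two binarization ideas (Top-K for sparse activations, codebook index to one-hot) are taken from the description above.

```python
from abc import ABC, abstractmethod

import numpy as np


class BinaryExtractor(ABC):
    """Hypothetical interface: turn raw model outputs into binary codes."""

    @abstractmethod
    def extract(self, raw):
        """Return an (n_samples, n_bits) array of 0/1 values."""


class TopKExtractor(BinaryExtractor):
    """Top-K binarization: the k largest activations per sample become 1."""

    def __init__(self, k):
        self.k = k

    def extract(self, raw):
        acts = np.asarray(raw, dtype=float)
        # Column indices of the k largest activations in each row.
        top = np.argpartition(acts, -self.k, axis=1)[:, -self.k:]
        codes = np.zeros(acts.shape, dtype=np.uint8)
        np.put_along_axis(codes, top, 1, axis=1)
        return codes


class CodebookExtractor(BinaryExtractor):
    """Codebook index to one-hot binary code (VQ-style quantizers)."""

    def __init__(self, codebook_size):
        self.codebook_size = codebook_size

    def extract(self, raw):
        idx = np.asarray(raw, dtype=int)
        codes = np.zeros((idx.shape[0], self.codebook_size), dtype=np.uint8)
        codes[np.arange(idx.shape[0]), idx] = 1
        return codes


def hamming_similarity(a, b):
    """Fraction of bit positions on which two codes agree."""
    return float(np.mean(np.asarray(a) == np.asarray(b)))


def shared_features(a, b):
    """Indices of bits that are active in both codes."""
    return np.flatnonzero(np.asarray(a) & np.asarray(b)).tolist()
```

Because both extractors return the same kind of 0/1 matrix, downstream analysis such as similarity or shared-feature queries does not need to know which backend produced the codes. Ties among equal activations in `TopKExtractor` are broken arbitrarily.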

Section 05

Validation Results and Case Studies

Validated on an MNIST binary autoencoder (32-bit codes):

  • Decoder determinism: 100% (the 32-bit code fully determines the output; n=100 tests)
  • Discovered dual pairs: 65
  • Dependencies: 179
  • Phase transitions: 0 (no significant shifts on this simple task)

Example pipelines include MNIST SAE, Pythia-70M SAE, and triadic bits experiments.
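
Screening many candidate bit pairs at once, as counts like these presumably require, is where the BH-FDR correction listed among the verification tools matters. The following is a minimal sketch of the standard Benjamini-Hochberg step-up procedure, not Reptimeline's own code; the function name `benjamini_hochberg` and the default `alpha` are assumptions for illustration.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level alpha.

    Standard step-up rule: sort the m p-values, find the largest rank k
    with p_(k) <= alpha * k / m, and reject the k smallest p-values.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Largest rank that clears its threshold; everything at or below
        # that rank is rejected, even if some earlier rank failed.
        k = int(np.max(np.flatnonzero(passed)))
        reject[order[: k + 1]] = True
    return reject
```

Note the step-up behavior: a p-value can be rejected even when it misses its own threshold, as long as a larger p-value further down the sorted list clears its threshold.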

Section 06

Practical Value and Application Scenarios

Reptimeline fills an important gap in AI interpretability tools, applicable to both academia and industry:

  1. Model Debugging: Locate root causes of training anomalies or concept learning failures.
  2. Safety Audit: Verify if models learn expected concepts instead of spurious correlations.
  3. Knowledge Distillation: Identify core concept structures to guide efficient student model design.
  4. Continual Learning: Monitor new concept emergence and old concept forgetting to prevent catastrophic forgetting.

It is a valuable tool for researchers and engineers seeking to understand neural network internal mechanisms.