Mirage: A Game Running Inside Neural Networks — Practical Exploration of Diffusion World Models

A research project that uses a CUDA simulator to generate training data, trains a diffusion model to learn the game's rules, and then lets users play the arcade game in the browser while the neural network "dreams" it, all on a single GTX 970 graphics card.

Tags: world model, diffusion model, neural network, game, CUDA, generative AI, GTX 970
Published 2026-06-10 17:18 | Recent activity 2026-06-10 17:28 | Estimated read 8 min



Section 02

Original Author and Source

  • Original Author/Maintainer: Ali Kendir (alikendir0)
  • Source Platform: GitHub
  • Original Title: mirage
  • Original Link: https://github.com/alikendir0/mirage
  • Publication Time: 2026-06-10
  • License: MIT

Section 03

Project Overview

Mirage is an experimental project that combines game simulation with deep learning. The core idea: rather than running any game code at play time, a neural network learns the game's rules purely by observing frames, then "dreams" the game in the browser, generating it one frame at a time while the player controls a world driven entirely by the network.

Most strikingly, every stage (simulation, data generation, training, and interactive inference) runs on a GTX 970, a 2014 graphics card with 4 GB of VRAM. Pre-trained weights are included in the repository, so users can try it directly without training anything.


Section 04

Technical Architecture: Data Factory and World Model

CUDA Data Factory: The project begins with a batched arcade game simulator built on a custom CUDA engine. The simulator is not the final product but a data factory: it steps 512 game scenes simultaneously on the GPU and rasterizes their output at roughly 206,000 frames per second. The training data is therefore not a dataset to download but a kernel that produces an endless stream of fresh frames on demand.
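
To make the "data factory" idea concrete, here is a minimal CPU sketch in plain Python. It is not the project's CUDA engine: the one-dimensional `WIDTH`-pixel screen, the single moving dot, and the names `step_scene`, `rasterize`, and `frame_batches` are all assumptions chosen for illustration. Only the batch size of 512 and the 206,000 frames-per-second figure come from the text above.

```python
import itertools
import random

NUM_SCENES = 512   # batch size quoted in the article
WIDTH = 16         # hypothetical tiny 1-D "screen", for illustration only


def step_scene(state):
    """Advance one scene by one tick; the position wraps at the edges."""
    x, vx = state
    return ((x + vx) % WIDTH, vx)


def rasterize(state):
    """Draw the scene as a row of pixels with a single lit dot."""
    x, _ = state
    return [1 if i == x else 0 for i in range(WIDTH)]


def frame_batches(seed, num_scenes=NUM_SCENES):
    """Endless generator: every iteration steps all scenes and yields their frames."""
    rng = random.Random(seed)
    states = [(rng.randrange(WIDTH), rng.choice([-1, 1])) for _ in range(num_scenes)]
    while True:
        states = [step_scene(s) for s in states]
        yield [rasterize(s) for s in states]


# Training code would pull batches forever; here we take just three.
batches = list(itertools.islice(frame_batches(seed=0), 3))

# Arithmetic on the quoted throughput: frames per second for each scene.
steps_per_scene_per_second = 206_000 / NUM_SCENES
```

Because the generator never ends, there is no fixed dataset to overfit to; the trainer simply asks for the next batch. Dividing the quoted throughput by the batch size suggests each scene advances roughly 400 times per second.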

World Model: A U-Net with 2.6 million parameters. It uses circular padding (the game arena wraps around, so the network's topology matches the world's) and FiLM (Feature-wise Linear Modulation) to inject the action as a conditioning signal. The model sees only pixels and inputs: the latest 4 frames plus the player's key presses, from which it predicts the next frame.
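
The two architectural ingredients named above can be sketched in a few lines of plain Python. This is an illustration of the general techniques, not Mirage's code: the function names, the three-key action vector, the weight values, and the choice to center the scale at 1 are all assumptions.

```python
def circular_pad(grid, pad):
    """Pad a 2-D grid by wrapping around both axes, as on a torus."""
    h, w = len(grid), len(grid[0])
    return [
        [grid[(y - pad) % h][(x - pad) % w] for x in range(w + 2 * pad)]
        for y in range(h + 2 * pad)
    ]


def action_to_film_params(action, weight_gamma, weight_beta):
    """Linear map from an action vector to one (gamma, beta) pair per channel.

    Centering gamma at 1.0 is an assumption: a zero action then leaves
    the features unchanged.
    """
    gamma = [1.0 + sum(w * a for w, a in zip(row, action)) for row in weight_gamma]
    beta = [sum(w * a for w, a in zip(row, action)) for row in weight_beta]
    return gamma, beta


def film(features, gamma, beta):
    """FiLM: scale and shift every value of channel c by gamma[c] and beta[c]."""
    return [[g * v + b for v in channel] for channel, g, b in zip(features, gamma, beta)]


# A 2x2 "arena": after padding, each border cell is a copy of the opposite edge.
padded = circular_pad([[1, 2], [3, 4]], pad=1)

# Hypothetical action layout (thrust, left, right) and made-up weights.
action = [1, 0, 0]
gamma, beta = action_to_film_params(
    action,
    weight_gamma=[[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]],
    weight_beta=[[0.1, 0.0, 0.0], [0.0, 0.0, 0.2]],
)
modulated = film([[1.0, 2.0], [3.0, 4.0]], gamma, beta)
```

With circular padding, a convolution at the right edge sees the pixels from the left edge, so an object leaving one side and re-entering on the other looks continuous to the network. FiLM lets the same convolutional features respond differently depending on which keys are held.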


Section 05

Evolution of Two Versions

v1, Deterministic Predictor: The training objective is to minimize average pixel error. Single-step prediction is nearly perfect (PSNR 42.6 dB), but wherever the future is uncertain (such as which direction rock fragments will fly), the loss-optimal output is the average of all possible futures, which shows up as "ghosting". Because the dream feeds on itself (the model's output becomes its input for the next step), the ghosting compounds like a photocopy of a photocopy.
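
The claim that a pixel-error objective rewards the average of all futures can be checked on a toy case. The three-pixel frames below are invented for illustration; the arithmetic is what matters.

```python
def expected_mse(prediction, futures):
    """Mean squared error of one prediction, averaged over equally likely futures."""
    per_future = [
        sum((p - f) ** 2 for p, f in zip(prediction, future)) / len(prediction)
        for future in futures
    ]
    return sum(per_future) / len(per_future)


# Two equally likely next frames: the fragment flies left or right.
fragment_left = [1.0, 0.0, 0.0]
fragment_right = [0.0, 0.0, 1.0]
futures = [fragment_left, fragment_right]

# The pixel-wise average shows a half-bright fragment in both places: a ghost.
ghost = [(a + b) / 2 for a, b in zip(fragment_left, fragment_right)]

loss_ghost = expected_mse(ghost, futures)             # 1/6
loss_commit = expected_mse(fragment_left, futures)    # 1/3
```

Committing to either real outcome costs twice as much expected loss as the ghost frame, so a network trained this way learns to output the ghost even though that frame never occurs in the game.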

v2, EDM Diffusion Model: Adopts the EDM framework of Karras et al. (2022), which trains the network to denoise at every noise level. During play, the model starts from pure noise and, guided by the context frames and the action, samples one sharp frame, committing to a single definite future instead of averaging all possibilities. Autoregressive PSNR at 10 steps improved from 25.6 dB to 27.3 dB.
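
The sampling idea can be sketched on a one-pixel toy problem. The noise schedule below uses the formula and default constants from the Karras et al. paper (`sigma_min=0.002`, `sigma_max=80`, `rho=7`); whether Mirage uses these values is not stated, so treat them as assumptions. The sketch also uses plain Euler steps, leaves out EDM's network preconditioning and context/action conditioning, and replaces the trained U-Net with a closed-form denoiser for data that is either -1 or +1.

```python
import math


def karras_sigmas(num_steps, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, followed by a final 0."""
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    ramp = [i / (num_steps - 1) for i in range(num_steps)]
    return [(hi + r * (lo - hi)) ** rho for r in ramp] + [0.0]


def two_mode_denoiser(x, sigma):
    """Ideal denoiser when the clean value is -1 or +1 with equal probability."""
    return math.tanh(x / sigma ** 2)


def sample(denoise, unit_noise, num_steps=4):
    """Euler sampler: start at pure noise and follow the denoiser down to sigma 0."""
    sigmas = karras_sigmas(num_steps)
    x = unit_noise * sigmas[0]
    for s_cur, s_next in zip(sigmas, sigmas[1:]):
        slope = (x - denoise(x, s_cur)) / s_cur
        x = x + (s_next - s_cur) * slope
    return x


# Different starting noise leads to different, but always sharp, outcomes.
outcome_a = sample(two_mode_denoiser, unit_noise=0.5)
outcome_b = sample(two_mode_denoiser, unit_noise=-0.5)
```

A regression model facing the same two-way uncertainty would output the average, 0, which is neither outcome. The sampler instead lands on one of the two modes, and which one depends only on the initial noise. The default of four steps mirrors the 4-step sampling mentioned in the comparison table.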


Section 06

Performance Comparison Data

| Metric | v1 U-Net | v2 Diffusion |
|---|---|---|
| Single-step PSNR | 42.6 dB | 42.2 dB |
| Autoregressive PSNR @ 10 steps | 25.6 dB | 27.3 dB |
| Dream rule test (does shooting a large rock split it?) | 24/24 | 18/24 |
| Sharpness retention after 10 seconds of dreaming | 0.985 | 1.004 |
| Dream speed on GTX 970 | 200+ fps | 54 fps (4-step sampling) |
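
For readers less familiar with the units, the sketch below shows how PSNR is conventionally computed and what the quoted numbers imply. It assumes pixel values in [0, 1]; the article does not say which range or exact evaluation procedure Mirage uses, so the derived error figures are indicative only.

```python
import math


def psnr(prediction, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    mse = sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(prediction)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


def rmse_from_psnr(db, max_value=1.0):
    """Invert the PSNR formula to get the root-mean-square pixel error."""
    return max_value * 10 ** (-db / 20)


# Every pixel off by 0.1 on a [0, 1] scale corresponds to 20 dB.
example_db = psnr([0.5, 0.5], [0.6, 0.4])

# RMS pixel errors implied by the table's PSNR values.
rmse_single_step_v1 = rmse_from_psnr(42.6)
rmse_rollout_v1 = rmse_from_psnr(25.6)
rmse_rollout_v2 = rmse_from_psnr(27.3)

# Time budget implied by 54 fps with 4 denoising steps per frame.
ms_per_frame = 1000 / 54
ms_per_denoising_step = ms_per_frame / 4
```

Under that assumption, 42.6 dB corresponds to an RMS error below 1% of the pixel range, while the 10-step autoregressive figures correspond to errors of roughly 4 to 5%. At 54 fps each frame has about 18.5 ms, so each of the four denoising passes has to finish in under 5 ms.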

One candid finding: in a mostly deterministic arcade game, the regression baseline is genuinely strong. It suits futures that are nearly unimodal and it learns the game's rules reliably (24/24 on the rule test, against 18/24 for diffusion). What the diffusion model adds is long-horizon stability: by the time v1's dream has blurred and dissolved, v2's dream still looks like a game, even after several minutes.


Section 07

Rigorous Engineering Practices

The simulator is held to a strict testing standard:

  • CPU oracle cross-check: Each game rule is implemented twice, once in fast CUDA and once in readable numpy. Tests run both side by side for 300 steps and require an error of at most 1e-4, identical counters, and bit-level determinism across runs.
  • Stateless random number generator: Each random number is a pure hash of (seed, arena, step, call-site), so CUDA and numpy produce exactly the same values.
  • Golden image test: The rasterizer's exact output for fixed scenes is committed as a reference; any unexpected pixel change fails CI.
  • 52 test cases in total; the CPU subset (oracle, data pipeline, model math) runs without a GPU.
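
The stateless random number generator is the piece that makes the CUDA and numpy versions comparable at all, so a small sketch may help. The article does not say which hash Mirage uses; the version below borrows the SplitMix64 finalizer as a stand-in, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
GOLDEN_GAMMA = 0x9E3779B97F4A7C15  # increment constant used by SplitMix64


def mix64(z):
    """SplitMix64 finalizer: a reversible scramble of a 64-bit integer."""
    z &= MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def stateless_uniform(seed, arena, step, call_site):
    """Uniform float in [0, 1) determined entirely by its four integer keys."""
    h = 0
    for key in (seed, arena, step, call_site):
        h = mix64((h ^ (key & MASK64)) + GOLDEN_GAMMA)
    return (h >> 11) / float(1 << 53)


# No generator state is carried along, so evaluation order is irrelevant:
# a GPU may visit arenas in any order and still match a sequential CPU loop.
in_order = [stateless_uniform(7, arena, 3, 0) for arena in range(8)]
reversed_order = [stateless_uniform(7, arena, 3, 0) for arena in reversed(range(8))][::-1]
```

A conventional stateful generator would hand out different numbers depending on how many calls came before, which is exactly what differs between a parallel kernel and a sequential loop. Keying every draw on (seed, arena, step, call-site) removes that dependence, which is what allows a tolerance as tight as 1e-4 over 300 steps.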

Section 08

Three Interaction Modes

The project provides three browser interaction modes:

  • ?mode=real: The real game, with an inertial spaceship, splitting rocks, and a circular arena.
  • ?mode=dream: The same game, but every frame is hallucinated by the model.
  • ?mode=both: Comparison mode, with the same input and a shared opening history, so you can watch the two diverge.
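
The relationship between the three modes can be sketched as follows. This is not the project's browser code: the fallback to `real` when no mode is given, the function names, and the toy "game" (a number that moves by the action each step) are assumptions. The context length of 4 frames comes from the model description above.

```python
from collections import deque
from urllib.parse import parse_qs

CONTEXT_FRAMES = 4  # the model sees the latest 4 frames


def parse_mode(query):
    """Read the mode from a query string such as '?mode=both'."""
    mode = parse_qs(query.lstrip("?")).get("mode", ["real"])[0]  # default is assumed
    if mode not in {"real", "dream", "both"}:
        raise ValueError(f"unknown mode: {mode}")
    return mode


def real_rollout(step, last_frame, actions):
    """Ground truth: every frame comes from the simulator."""
    frames = []
    for action in actions:
        last_frame = step(last_frame, action)
        frames.append(last_frame)
    return frames


def dream_rollout(predict_next, opening_frames, actions):
    """Dream: every predicted frame is fed back in as context for the next one."""
    context = deque(opening_frames, maxlen=CONTEXT_FRAMES)
    frames = []
    for action in actions:
        frame = predict_next(list(context), action)
        frames.append(frame)
        context.append(frame)
    return frames


# Toy stand-ins: the "model" is the true rule plus a small constant error.
def toy_step(frame, action):
    return frame + action


def toy_model(context, action):
    return context[-1] + action + 0.1


opening = [0.0, 0.0, 0.0, 0.0]   # shared opening history
actions = [1, 1, 1]              # same input for both worlds

mode = parse_mode("?mode=both")
real_frames = real_rollout(toy_step, opening[-1], actions)
dream_frames = dream_rollout(toy_model, opening, actions)
gaps = [abs(d - r) for d, r in zip(dream_frames, real_frames)]
```

Even with a tiny per-step error, the gap between the two worlds grows at every step because the dream never gets to look at the real game again. That compounding is what the comparison mode makes visible.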