
Mirage: A Game Running Inside Neural Networks — Practical Exploration of Diffusion World Models

A highly creative research project that uses a CUDA simulator to generate data, trains a diffusion model to learn game rules, and finally lets users play an arcade game via neural network "dreaming" in the browser—all using just a single GTX 970 graphics card.

Tags: World Model, Diffusion Model, Neural Network, Game, CUDA, Generative AI, GTX 970

Published 2026-06-10 17:18 | Recent activity 2026-06-10 17:28 | Estimated read 8 min

Section 01

Introduction: Mirage, a Game Running Inside Neural Networks

Section 02

Original Author and Source

Original Author/Maintainer: Ali Kendir (alikendir0)
Source Platform: GitHub
Original Title: mirage
Original Link: https://github.com/alikendir0/mirage
Publication Time: 2026-06-10
License: MIT

Section 03

Project Overview

Mirage is an experimental project that combines a game engine with deep learning. Its core idea: rather than running any game code at play time, a neural network learns the game's rules solely by observing game frames and then "dreams" in the browser, generating the game frame by frame so that players control a world driven entirely by the network.

The most striking aspect of this project is that all computations—including simulation, data generation, training, and interactive inference—are completed on a 2014 GTX 970 graphics card (4GB VRAM). Pre-trained weights are included in the repository, so users can experience it directly without training.

Section 04

Technical Architecture: Data Factory and World Model

CUDA Data Factory: The project starts with a batched arcade game simulator built on a custom CUDA engine. The simulator is not the final product but a data factory: it steps 512 game scenes simultaneously on the GPU and rasterizes roughly 206,000 frames per second. There is no dataset to download; training data comes from a kernel that generates an endless supply of fresh frames on the fly.
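The data-factory idea can be illustrated without a GPU. The sketch below is a NumPy stand-in, not the project's CUDA engine: the function name `data_factory`, the one-dot "physics", and the three-value action encoding are all hypothetical, chosen only to show a generator that steps many arenas at once and yields fresh frames forever.

```python
import numpy as np

def data_factory(num_arenas=512, size=32, seed=0):
    """Yield an endless stream of (frames, actions) batches from a toy simulator."""
    rng = np.random.default_rng(seed)
    # One moving dot per arena (hypothetical stand-in for ships and rocks).
    pos = rng.uniform(0, size, (num_arenas, 2))
    vel = rng.uniform(-1, 1, (num_arenas, 2))
    rows = np.arange(num_arenas)
    while True:
        # Hypothetical action encoding: 0 = none, 1 = push left, 2 = push right.
        actions = rng.integers(0, 3, num_arenas)
        vel[:, 0] += (actions == 2) * 0.1 - (actions == 1) * 0.1
        pos = (pos + vel) % size            # positions wrap at the arena edges
        px = pos.astype(int) % size         # guard against a rounded-up edge value
        # "Rasterize": one lit pixel per arena.
        frames = np.zeros((num_arenas, size, size), dtype=np.float32)
        frames[rows, px[:, 1], px[:, 0]] = 1.0
        yield frames, actions
```

A training loop would simply call `next()` on the generator for each batch, so no frame is ever stored on disk or seen twice.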

World Model: A U-Net with 2.6 million parameters. It uses circular padding (the game arena wraps around, so the network's topology matches the world's) and injects the action condition through FiLM (Feature-wise Linear Modulation). The model sees only pixels and inputs: the latest 4 frames plus the player's key presses, from which it predicts the next frame.
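Both ingredients are simple to show in isolation. The following is a minimal NumPy sketch, not the project's layers: `circular_pad` and `film` are hypothetical helper names, and the FiLM scale and shift are produced here by a plain linear map of a one-hot action, whereas a real model would learn that mapping.

```python
import numpy as np

def circular_pad(frame, pad):
    """Pad a 2-D frame so that each edge is filled from the opposite edge."""
    return np.pad(frame, pad, mode="wrap")

def film(features, action_onehot, w_gamma, w_beta):
    """Feature-wise Linear Modulation: per-channel scale and shift from the action.

    features: (C, H, W); action_onehot: (A,); w_gamma, w_beta: (A, C).
    """
    gamma = action_onehot @ w_gamma   # (C,) per-channel scale
    beta = action_onehot @ w_beta     # (C,) per-channel shift
    return gamma[:, None, None] * features + beta[:, None, None]
```

With circular padding, a convolution at the right edge of the frame sees the pixels from the left edge, which is what an object crossing the boundary of a wrap-around arena requires.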

Section 05

Evolution of Two Versions

v1 (Deterministic Predictor): Trained to minimize average pixel error. Its single-step prediction is nearly perfect (PSNR 42.6 dB), but wherever the future is uncertain (for example, which direction rock fragments will fly), the loss-optimal output is the average of all possible futures, which shows up as "ghosting". Because the dream feeds on itself (the model's output becomes its input for the next step), the ghosting compounds like a photocopy of a photocopy.
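The ghosting argument can be checked with a few lines of arithmetic. In this toy example (a made-up 5-pixel "frame", not data from the project), a fragment ends up at the far left or the far right with equal probability, and the blurry average scores a lower expected squared error than either sharp, committed guess.

```python
import numpy as np

# Two equally likely futures for a 1-D "frame": the fragment flies left or right.
left = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
right = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
futures = np.stack([left, right])

def expected_mse(pred):
    """Mean squared error of one prediction, averaged over both futures."""
    return float(np.mean((futures - pred) ** 2))

# The MSE-optimal prediction is the pixel-wise mean: two half-bright ghosts.
blurry = futures.mean(axis=0)

print(expected_mse(blurry))  # 0.1
print(expected_mse(left))    # 0.2
```

A regression loss therefore rewards the ghosted frame, even though it matches neither future the simulator could actually produce.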

v2 (EDM Diffusion Model): Adopts the EDM framework of Karras et al. (2022) and trains the network to denoise at every noise level. During gameplay, the model samples a sharp frame starting from pure noise, guided by the context frames and the action, so it commits to one definite future instead of averaging all possibilities. Autoregressive PSNR at 10 steps improved from 25.6 dB to 27.3 dB.
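A rough sketch of what sampling from noise looks like in the EDM formulation follows. It uses the noise schedule and default constants from the EDM paper (sigma_min 0.002, sigma_max 80, rho 7) with a plain first-order Euler step instead of the paper's second-order Heun step. The `denoiser` argument is a hypothetical stand-in for the trained network, and the project's actual sampler, constants, and conditioning interface are not documented here, so treat all of them as assumptions.

```python
import numpy as np

def karras_sigmas(n, sigma_min=0.002, sigma_max=80.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, followed by a final 0."""
    ramp = np.linspace(0.0, 1.0, n)
    hi, lo = sigma_max ** (1 / rho), sigma_min ** (1 / rho)
    sigmas = (hi + ramp * (lo - hi)) ** rho
    return np.append(sigmas, 0.0)

def sample(denoiser, shape, n_steps=4, seed=0):
    """Euler sampler: start from pure noise and step toward a clean frame.

    denoiser(x, sigma) must return an estimate of the clean frame; in a
    world model it would also receive the context frames and the action.
    """
    rng = np.random.default_rng(seed)
    sigmas = karras_sigmas(n_steps)
    x = rng.standard_normal(shape) * sigmas[0]
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - denoiser(x, sigma)) / sigma   # direction toward lower noise
        x = x + (sigma_next - sigma) * d       # one Euler step
    return x
```

Each step costs one network evaluation, so a 4-step sampler needs about four forward passes per frame. That is consistent in spirit with the diffusion model running slower than the single-pass v1 predictor.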

Section 06

Performance Comparison Data

Metric	v1 U-Net	v2 Diffusion
Single-step PSNR	42.6 dB	42.2 dB
Autoregressive PSNR@10 steps	25.6 dB	27.3 dB
Dream Rule Test (Does shooting a large rock split it?)	24/24	18/24
Sharpness retention after 10 seconds of dreaming	0.985	1.004
Dream speed on GTX 970	200+ fps	54 fps (4-step sampling)

An honest finding: in a mostly deterministic arcade game, the regression baseline is genuinely strong. It suits futures that are nearly unimodal and masters the game rules (24/24 on the rule test versus 18/24 for diffusion). What the diffusion model adds is long-horizon commitment: after v1's dream has blurred and dissolved, v2's dream still looks like a game several minutes in.
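For readers unfamiliar with the metric, PSNR is a logarithmic restatement of mean squared error. The helper below is a generic sketch that assumes pixel values in [0, 1]; the project's own evaluation code and pixel range are not shown in this article.

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Under that assumption, 42.6 dB corresponds to a root-mean-square pixel error of roughly 0.007, while 25.6 dB and 27.3 dB correspond to roughly 0.053 and 0.043. The gap between single-step and 10-step scores shows how quickly errors compound once the model consumes its own output.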

Section 07

Rigorous Engineering Practices

The simulator part of the project demonstrates extremely high engineering standards:

CPU Oracle Cross-Verification: Each game rule is implemented twice, once in fast CUDA and once in readable numpy. Tests run both side by side for 300 steps and require numerical error of at most 1e-4, identical counters, and bit-level determinism across runs.
Stateless Random Number Generator: Each random number is a pure hash function of (seed, arena, step, call-site), fully consistent between CUDA and numpy.
Golden Image Test: The exact rasterizer output for fixed scenes is committed as a reference; any unexpected pixel change fails CI.
Test Suite: 52 test cases in total; the CPU subset (oracle, data pipeline, model math) runs without a GPU.
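The stateless generator and the oracle check fit naturally into one small example. The sketch below is not the project's hash: it borrows the 32-bit finalizer constants from MurmurHash3 purely for illustration, has not been vetted for statistical quality, and uses the hypothetical names `rand_oracle` and `rand_batch`. A pure-Python version plays the readable oracle and a vectorized NumPy version stands in for the fast path.

```python
import numpy as np

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits

def rand_oracle(seed, arena, step, site):
    """Readable reference: hash four non-negative ints to a float in [0, 1)."""
    h = 0
    for v in (seed, arena, step, site):
        h = (h ^ v) & MASK
        h ^= h >> 16
        h = (h * 0x85EBCA6B) & MASK
        h ^= h >> 13
        h = (h * 0xC2B2AE35) & MASK
        h ^= h >> 16
    return h / 2**32

def rand_batch(seed, arenas, step, site):
    """Vectorized over arenas; must agree with rand_oracle bit for bit."""
    u = np.uint64
    arenas = np.asarray(arenas, dtype=u)
    h = np.zeros(arenas.shape, dtype=u)
    for v in (u(seed), arenas, u(step), u(site)):
        h = (h ^ v) & u(MASK)
        h = h ^ (h >> u(16))
        h = (h * u(0x85EBCA6B)) & u(MASK)   # product stays below 2**64
        h = h ^ (h >> u(13))
        h = (h * u(0xC2B2AE35)) & u(MASK)
        h = h ^ (h >> u(16))
    return h.astype(np.float64) / 2**32

# Oracle-style check: both implementations must produce identical values.
arenas = np.arange(16)
fast = rand_batch(7, arenas, 3, 1)
slow = np.array([rand_oracle(7, int(a), 3, 1) for a in arenas])
assert np.array_equal(fast, slow)
```

Because each value depends only on its four inputs, no generator state has to be carried or synchronized, and any arena at any step can be replayed in isolation when a test fails.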

Section 08

Three Interaction Modes

The project provides three browser interaction modes:

?mode=real: Real game, inertial spaceship, splitting rocks, circular arena
?mode=dream: The same game, but generated frame by frame via model hallucination
?mode=both: Comparison mode—same input, shared opening history, observe how the two diverge.
