microgpt

An educational web UI that lets you see inside a tiny transformer as it generates names, character by character.

Built on top of Andrej Karpathy's microgpt — a ~200-line pure-Python character-level GPT that implements everything from scratch (autograd, attention, training) with zero dependencies. See his blog post for more.

What it shows

The UI has two tabs:

Generation

Visualizes each step of name generation through 6 pipeline cards:

Tokenization — how characters map to token IDs (27 tokens: a-z + [START])
Embedding — token and position embeddings combined into a 16-number vector
Attention — heatmaps for each of the 4 attention heads showing which earlier characters the model focuses on
Combining — per-head outputs (4 groups of 4), blended via wo, then refined through an MLP (64 neurons) into a final 16-dim embedding
Projection — the final embedding is projected through lm_head into 27 raw logits (one per character), with the top 5 labeled
Prediction — probability bar chart over all possible next characters (softmax of logits)
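The first and third cards can be approximated in a few lines of plain Python. This is a minimal sketch, not the code in inference.py: the function names, the ID given to [START], and the toy query/key vectors are all assumptions made for illustration.

```python
import math

# Assumed vocabulary layout: 26 letters plus one [START] token (27 total).
# Which ID the special token gets is an assumption made for this sketch.
CHARS = "abcdefghijklmnopqrstuvwxyz"
START = len(CHARS)  # hypothetical ID 26 for [START]


def encode(name):
    """Map a name to token IDs, prefixed with [START]."""
    return [START] + [CHARS.index(c) for c in name]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product weights of one query over the earlier keys."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)


tokens = encode("emma")
print(tokens)  # [26, 4, 12, 12, 0]

# One head works on 4 of the 16 embedding dimensions (16 / 4 heads).
# These vectors are toy values, not learned weights.
query = [1.0, 0.0, 0.0, 0.0]
keys = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
weights = attention_weights(query, keys)
print(weights)  # one weight per earlier position, summing to 1
```

Each row of an attention heatmap corresponds to one such list of weights: larger values mark the earlier characters that head is focusing on.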

Model Weights

Displays all 9 weight matrices (4,192 parameters in total) as interactive colored heatmaps. Hover over any cell to see its exact value. Matrices are shown in forward-pass order: token/position embeddings, Q/K/V/O attention projections, MLP up/down layers, and the final language model head. Card colors match the corresponding Generation tab stages.
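The parameter count can be checked by hand from the sizes mentioned above. The sketch below assumes a single transformer layer and a context length of 16 positions; the README does not state the context length, so treat BLOCK_SIZE as an assumption. The dictionary keys are descriptive labels, not the names used in weights.json.

```python
# Sizes stated in this README.
VOCAB = 27       # a-z plus [START]
N_EMBD = 16      # embedding width
MLP_HIDDEN = 64  # MLP neurons

# Assumption: 16 positions of context (not stated in the README).
BLOCK_SIZE = 16

# (rows, columns) for each matrix, in forward-pass order.
shapes = {
    "token embedding": (VOCAB, N_EMBD),
    "position embedding": (BLOCK_SIZE, N_EMBD),
    "attention Q": (N_EMBD, N_EMBD),
    "attention K": (N_EMBD, N_EMBD),
    "attention V": (N_EMBD, N_EMBD),
    "attention O": (N_EMBD, N_EMBD),
    "MLP up": (MLP_HIDDEN, N_EMBD),
    "MLP down": (N_EMBD, MLP_HIDDEN),
    "language model head": (VOCAB, N_EMBD),
}

total = sum(rows * cols for rows, cols in shapes.values())
print(len(shapes), total)  # 9 4192
```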

Setup

Requires Python 3.13+ and uv.

# Install dependencies
uv sync

# Train the model and save weights (~1 min)
uv run python save_weights.py

# Start the web UI
uv run python app.py

Then visit http://localhost:5001.

Docker (optional)

docker build -t microgpt .
docker run -p 5001:5001 microgpt

The image runs the app under gunicorn as a production server, so the host machine needs neither Python nor uv. This makes it convenient for deploying to a server.

Deploying to Fly.io

The repo includes a fly.toml config. CI/CD auto-deploy is not currently enabled, but it can be set up with a GitHub Actions workflow.

To set up your own deployment:

Install the Fly CLI and sign up
Run fly launch (creates the app and sets the FLY_API_TOKEN GitHub secret)
Push to main — GitHub Actions handles the rest

To deploy manually: fly deploy

As of March 2026, running on Fly.io cost about $0.20/day with min_machines_running = 1 in fly.toml. This keeps one machine always warm for snappy responses — with 0, cold starts made it noticeably sluggish.

Usage

Generate a Name — generates a full name, animating one character at a time. Click any character to inspect that step's internals.
Step Through — generates one character at a time. Click "Next Character" to advance. Better for deep exploration.
Temperature slider — controls randomness. Low = predictable, high = creative. Recomputes probabilities client-side from raw logits (no server call).
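The temperature slider boils down to dividing the logits by the temperature before applying softmax. The frontend does this in the browser; the Python sketch below only illustrates the arithmetic, and the function name and sample logits are made up for the example.

```python
import math


def probabilities(logits, temperature):
    """Softmax of logits / temperature (temperature must be positive)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # toy values, not real model output

cold = probabilities(logits, 0.5)  # low temperature: sharper distribution
hot = probabilities(logits, 2.0)   # high temperature: flatter distribution

print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
```

Because only the raw logits are needed, the probabilities can be recomputed for any slider position without asking the server to run the model again.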

How it works

microgpt.py is the original training script (untouched). save_weights.py runs training and serializes the 4,192 learned parameters to weights.json. inference.py reimplements the forward pass with plain floats (no autograd) and captures intermediates at every layer. app.py serves a Flask API that the single-page frontend calls to generate names and retrieve visualization data.
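To give a feel for what "plain floats and captured intermediates" means, here is a minimal sketch of an MLP step that records each stage in a dictionary. It is not the code in inference.py: the function names, the trace keys, the ReLU activation, and the toy weights are all assumptions chosen for illustration.

```python
import json


def linear(x, w):
    """Multiply vector x by matrix w, given as a list of output rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def relu(xs):
    """Assumed activation for this sketch."""
    return [max(0.0, v) for v in xs]


def mlp_forward(x, w_up, w_down, trace):
    """Run a two-layer MLP and record every intermediate in `trace`."""
    trace["mlp_input"] = list(x)
    hidden = linear(x, w_up)
    trace["mlp_hidden"] = hidden
    activated = relu(hidden)
    trace["mlp_activated"] = activated
    out = linear(activated, w_down)
    trace["mlp_output"] = out
    return out


# Toy weights: 2 inputs -> 3 hidden units -> 2 outputs.
w_up = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
w_down = [[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]]

trace = {}
out = mlp_forward([2.0, 1.0], w_up, w_down, trace)
print(out)  # [1.0, 3.0]

# Plain lists of floats serialize directly, so a web API can return them as JSON.
payload = json.dumps(trace)
```

Keeping everything as lists of floats means the recorded values need no conversion before being sent to the frontend for visualization.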
