RL Agent Skeleton
A short hub for the agent / RL teaching pages. If you only have ten minutes, this is the route I’d recommend:
1. Blank RL Agent Template: minimal `Agent` class and the First runnable system (Gymnasium loop). The shape of an agent before any libraries do the showing.
2. `learn()` progression: same page, under Evolving `learn()`. Memory tally → tabular Q update → neural batch sketch. Same hook, three different bodies.
3. PyTorch DQN Agent Walkthrough: minimal DQN-style `learn()` with PyTorch in full context.
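To make "same hook, three different bodies" concrete, here is a rough sketch of the first two bodies behind one `learn()` signature. The class names, the transition tuple layout, and the `alpha`/`gamma` defaults are assumptions made for illustration, not the code from the linked pages. The neural batch body is left out here because it needs PyTorch; it would keep the same signature and sample from memory instead of updating a table.

```python
from collections import defaultdict


class TallyAgent:
    """Body 1 (hypothetical name): learn() only counts what it has seen."""

    def __init__(self):
        self.counts = defaultdict(int)

    def learn(self, state, action, reward, next_state, done):
        # No value estimate yet, just a visit tally per (state, action).
        self.counts[(state, action)] += 1


class TabularQAgent:
    """Body 2 (hypothetical name): same hook, one-step Q-learning update."""

    def __init__(self, n_actions, alpha=0.5, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # step size (assumed default)
        self.gamma = gamma  # discount factor (assumed default)
        self.q = defaultdict(float)

    def learn(self, state, action, reward, next_state, done):
        # Bootstrap from the best next action unless the episode ended.
        best_next = 0.0 if done else max(
            self.q[(next_state, a)] for a in range(self.n_actions)
        )
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# The caller does not change when the body does.
transition = ("s0", 1, 1.0, "s1", True)
tally = TallyAgent()
tabular = TabularQAgent(n_actions=2)
for agent in (tally, tabular):
    agent.learn(*transition)
```

The loop at the bottom is the point: whatever calls `learn()` stays the same while the body behind it gets more capable.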
The same conceptual categories show up in larger RL systems — environment interaction, policy, memory, exploration, learning. These pages are minimal teaching examples, not a full training stack. The point isn’t to compete on benchmark scores. The point is to make the slots visible before the implementations get clever, so when a serious RL system breaks later, you can still tell which slot broke.
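As a minimal sketch of those five slots, the class below gives each one a visible home. Everything here is an assumption for illustration: `ToyEnv` is a hypothetical stand-in that only mimics the return shapes of Gymnasium's `reset()` and `step()`, and the `Agent` method names are not taken from the linked pages.

```python
import random


class ToyEnv:
    """Hypothetical stand-in mimicking Gymnasium's reset/step return shapes."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        terminated = self.t >= self.horizon
        # (observation, reward, terminated, truncated, info)
        return self.t, reward, terminated, False, {}


class Agent:
    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon  # exploration slot
        self.memory = []        # memory slot
        self.rng = random.Random(seed)

    def policy(self, obs):
        return 0  # policy slot: placeholder that always picks action 0

    def act(self, obs):
        # Exploration wraps the policy: random action with probability epsilon.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.policy(obs)

    def remember(self, transition):
        self.memory.append(transition)

    def learn(self):
        pass  # learning slot: intentionally blank in the skeleton


def run_episode(env, agent):
    """Environment-interaction slot: one episode, returning the total reward."""
    obs, info = env.reset()
    total, done = 0.0, False
    while not done:
        action = agent.act(obs)
        next_obs, reward, terminated, truncated, info = env.step(action)
        agent.remember((obs, action, reward, next_obs, terminated))
        agent.learn()
        total += reward
        obs = next_obs
        done = terminated or truncated
    return total


agent = Agent(n_actions=2)
episode_return = run_episode(ToyEnv(horizon=5), agent)
```

Swapping `ToyEnv` for a real Gymnasium environment should only touch the line that builds the environment, since the loop relies on nothing beyond the `reset()` and `step()` return shapes.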
Who this is for
You’ve run a Gymnasium (CartPole-style) loop before and you’re okay reading Python. You don’t need a production RL framework — these pages deliberately stay small. If you’re only wiring an API-backed “agent,” the shape here still helps: the same slots (what is observed, what is stored, what is updated, what is executed) reappear in tool-using systems, just with different implementations.
Pick a page by goal
| Goal | Start here |
|---|---|
| See the smallest | |
| Same ideas with a real | |
| Map this teaching stack to memory, evaluation, and runtime work on the site | |
After the skeleton
Once the slots are visible, the debugging question stops being “the network is bad” and becomes which slot diverged — bad observations, stale memory, wrong exploration pressure, or a learning update that doesn’t match the policy you thought you trained. That’s the same observability instinct behind structured failure traces and the wider evaluation lane, just applied to RL-shaped systems.
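One way to act on "which slot diverged" is to log a small indicator per slot at each step. The sketch below is only an illustration: `slot_report`, its parameters, and the chosen indicators are all hypothetical, and a real system would pick signals that fit its own observations and update rule.

```python
import math


def slot_report(obs, memory_steps, current_step, epsilon, td_errors):
    """Return one small health indicator per slot (all names hypothetical)."""
    return {
        # Observation slot: any NaN or inf in the observation vector?
        "observation_finite": all(math.isfinite(x) for x in obs),
        # Memory slot: how many steps old is the oldest stored transition?
        "oldest_memory_age": (
            current_step - min(memory_steps) if memory_steps else 0
        ),
        # Exploration slot: current exploration pressure.
        "epsilon": epsilon,
        # Learning slot: average size of the recent update errors.
        "mean_abs_td_error": (
            sum(abs(e) for e in td_errors) / len(td_errors) if td_errors else 0.0
        ),
    }


report = slot_report(
    obs=[0.1, float("nan")],
    memory_steps=[10, 40, 90],
    current_step=100,
    epsilon=0.05,
    td_errors=[0.5, -1.5],
)
```

In this made-up call the report flags the observation slot (`observation_finite` is `False`) while the other three indicators look unremarkable, which is the kind of per-slot answer the paragraph above is asking for.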
Companion code direction
If you want the runnable code separately, a minimal RL agent repository would be laid out roughly like:
src/minimal_rl_agent/
scripts/train_cartpole.py
scripts/evaluate_cartpole.py
The website you’re reading is the documentation layer (the public /docs/ site). Runnable RL code belongs in a separate codebase — training scripts and a real package layout — not inside this website repo. I keep them apart on purpose so the deployed site stays small and the code story stays versioned where it actually runs.