Security Research

This is the first part of the site that isn’t primarily architecture theory. It’s where the research work starts pointing at real adversarial surface — prompt injection, agent sandbox escapes, trust boundaries, supply-chain compromise in the AI stack, and the kind of vulnerability class that only exists because models got big enough to be dangerous and small enough to run inside everyday software.

It’s also where my history with this field actually starts. Before any of the AI work, the thing that pulled me into systems at all was privacy as freedom. The belief that controlling your own information is the version of power that matters in the current world. That belief is the reason I took the CompTIA route first, learned Linux, then Python, and ended up following the rabbit hole all the way into research engineering. The pivots between then and now are real — I can’t stay on one topic — but the throughline is the same: understand the system well enough that it stops being able to surprise you.

Why I’m making this a lane, not a single article

Two reasons.

One, the AI frontier has changed what “security” means. A few years ago, the red-team skill was knowing which Metasploit module to call and which Wireshark filter to read. Now the tooling is becoming cheap, because AI can pair-program offensive and defensive code at a pace that makes a journal full of memorized commands less valuable than taste. The part that still matters is the landscape: what can and can’t be done with AI agents, what boundaries exist between content and authority, what a prompt-injection payload even is. What you build with that tooling is a matter of taste. The research is in whether you can describe the boundary well enough to defend it.

Two, I think the part of security I care about most is the part that sits across red team, blue team, and forensics at once. Defenders who’ve never attacked a system miss the creative pivots. Attackers who’ve never done incident response miss the structural lessons. Forensic thinking — what happened, in what order, and what evidence survived — is the part that ties the other two together, and it’s also the part that mirrors the rest of this site’s work on memory substrates and structured failure traces. Security work and failure-trace work turn out to be the same shape of discipline pointed at different problems.

What this lane covers

Articles in this lane will grow as the research does. Current entries:

Prompt injection, and why “just sanitize the input” isn’t enough — plain-English walkthrough of the vulnerability class, direct vs indirect prompt injection, why agents make it dangerous in ways chatbots don’t, and what the practical defense looks like. Cross-references three recent CyberScoop articles as grounding: the joint Five Eyes guidance on agentic AI deployment, OpenAI admitting that prompt injection may never be fully solved for Atlas, and the Pillar Security disclosure of a sandbox-escape RCE in Google Antigravity.
PSA for AI users: if you installed lightning==2.6.3, act like your secrets are burned — a direct public warning for people in the AI community who may have imported the compromised PyTorch Lightning package. Less doctrine, more “check your environment and rotate your keys now.”
Failure as signal: Mythos, Glasswing, and the new cyber defense loop — why the Anthropic Mythos/Glasswing announcement matters beyond the model launch itself: exploit generation, patching, zero-day discovery, and evaluation are starting to collapse into one machine-speed defensive learning loop, and that changes what defensive architecture has to look like.
GitHub’s git push RCE, and the rule it violated — a real-world trust-boundary failure outside AI: user-controlled git push options crossed into GitHub’s trusted internal metadata channel and became control-plane state. This is the backend-infrastructure version of the same architectural mistake behind prompt injection.
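For the Lightning PSA entry above, the "check your environment" step could be sketched roughly like this. It is a minimal sketch, not the PSA's own script: it assumes the installed distribution is named `lightning` (as in the pin `lightning==2.6.3`), and `FLAGGED` and `flagged_installs` are names made up here for illustration. It only inspects the Python environment it runs in.

```python
from importlib import metadata

# Version taken from the PSA title above. The distribution name "lightning"
# is assumed from the same pin string.
FLAGGED = {"lightning": {"2.6.3"}}


def flagged_installs(flagged=FLAGGED, version_of=metadata.version):
    """Return {package: version} for installed packages whose version is flagged."""
    hits = {}
    for name, bad_versions in flagged.items():
        try:
            installed = version_of(name)
        except metadata.PackageNotFoundError:
            continue  # not installed in this environment
        if installed in bad_versions:
            hits[name] = installed
    return hits


if __name__ == "__main__":
    hits = flagged_installs()
    if hits:
        print("flagged packages found:", hits)
    else:
        print("no flagged versions found in this environment")
```

A clean result only says the flagged version is not installed in this one environment right now. It says nothing about other virtualenvs, containers, or past installs, so it is not a substitute for rotating keys.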

The implementation of the doctrine isn’t a security-lane article. It’s part of the canonical memory substrate: earth-database carries a trust/ module that classifies every ingested item by source, trust zone, content role, and injection risk before it can be stored, retrieved, or used by an agent. External content stays evidence. Never authority. The earth-database article lives in Core Architecture because the trust layer is part of what the substrate is, not a bolt-on safety feature.
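To make the four classification axes concrete, here is a rough sketch of what a label like that might look like. It is not the actual trust/ module: the zone names, the `TrustLabel` and `classify` names, and the phrase-matching heuristic are all assumptions made up for illustration, and the rule that only operator content carries authority is one possible reading of "external content stays evidence".

```python
from dataclasses import dataclass
from enum import Enum


class TrustZone(Enum):
    OPERATOR = "operator"    # hypothetical: the person running the system
    INTERNAL = "internal"    # hypothetical: the substrate's own records
    EXTERNAL = "external"    # web pages, emails, tool output, uploads


class ContentRole(Enum):
    AUTHORITY = "authority"  # may direct what an agent does
    EVIDENCE = "evidence"    # may inform an answer, never direct an action


@dataclass(frozen=True)
class TrustLabel:
    source: str
    zone: TrustZone
    role: ContentRole
    injection_risk: str  # "low" or "high" in this sketch


# Toy heuristic only; real injection scanning is a much harder problem.
SUSPICIOUS_PHRASES = ("ignore previous instructions", "you are now", "system prompt")


def classify(source: str, zone: TrustZone, text: str) -> TrustLabel:
    """Label an item before it is stored, retrieved, or shown to an agent."""
    lowered = text.lower()
    risk = "high" if any(p in lowered for p in SUSPICIOUS_PHRASES) else "low"
    # Assumed rule: only operator content can carry authority.
    role = ContentRole.AUTHORITY if zone is TrustZone.OPERATOR else ContentRole.EVIDENCE
    return TrustLabel(source=source, zone=zone, role=role, injection_risk=risk)
```

The design point is that the role is derived from where the content came from, not from what the content says about itself. A web page that claims to be a system message is still labeled evidence.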

What this lane won’t be

Not operational advice. Not a pentest methodology doc. Not a how-to for attacking anything specific. The articles here are research-shaped writing about the architecture of the security problem — why the class of vulnerability exists, what the boundary between content and authority actually is, what a defense looks like when you treat it as research apparatus instead of a checklist. If you want the deployable-tooling version, that’s a different project and a different kind of content.

The reason for that boundary isn’t caution for its own sake. It’s that the research question underneath this whole site — how does a memory substrate stay honest enough to be safe? — is directly answered by the security work. Getting the architecture right is the practical protection. The tooling follows from it, not the other way around.

How this fits the rest of the stack

flowchart LR
  subgraph sec["security research"]
    pi["prompt injection · doctrine"]
    pl["Lightning package PSA · supply-chain warning"]
    my["Mythos / Glasswing · failure as signal"]
    gh["GitHub push RCE · control-plane contamination"]
  end

  subgraph substrate["memory substrate · sibling layers"]
    ed["earth-database<br/>local canonical core<br/>(carries trust layer)"]
    md["memory-dropbox<br/>event-sourced<br/>substrate experiment"]
  end

  subgraph runtime["runtime"]
    oos["Obversary-OS"]
  end

  subgraph eval["evaluation"]
    traces["structured failure traces"]
  end

  pi -.->|AI-agent version of the boundary| substrate
  pl -.->|AI stack itself becomes attack surface| oos
  my -.->|defensive learning loop| substrate
  gh -.->|backend-infrastructure proof point| substrate
  pi -.->|boundaries the runtime must respect| oos
  my -.->|defenders need observability and adaptation| traces
  gh -.->|same law at infra level| oos
  ed -.->|trust events + provenance| traces

The doctrine from the prompt-injection article lands in two places: as a set of boundaries both memory substrates must enforce at ingress, and as a set of tool-permission rules Obversary-OS must respect at runtime.

The other three articles each press on that doctrine from a different side. The Lightning PSA adds a more immediate public-facing angle: the AI stack itself is now an attack surface, and package artifacts need to be treated as claims that must earn trust through provenance and inspection. The Mythos / Glasswing article raises the pressure by showing that failure discovery, exploit construction, and patching are starting to run inside a machine-speed defensive learning loop, which makes observability and adaptation load-bearing for defenders. The GitHub push-pipeline article shows the same mistake in a non-LLM setting: untrusted input crossing into trusted control metadata.

The implementation of the ingress side of that doctrine lives inside earth-database, the local canonical substrate, because the trust layer is part of what the canonical memory core is, not a bolt-on security add-on. Once proven there, the same discipline flows into the larger event-sourced substrate (memory-dropbox).
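The runtime half of that doctrine, tool-permission rules, can be illustrated with a small gate. This is a sketch under assumptions, not how Obversary-OS does it: the tool names, the two zone strings, and the `ToolRequest` and `is_allowed` names are all hypothetical. It also assumes the runtime already knows which zone prompted a call, which in practice is the hard part.

```python
from dataclasses import dataclass

# Hypothetical tool names, chosen only for illustration.
READ_ONLY_TOOLS = frozenset({"search_notes", "read_file"})
SIDE_EFFECT_TOOLS = frozenset({"send_email", "run_shell"})


@dataclass(frozen=True)
class ToolRequest:
    tool: str
    # Zone of the content that prompted the call: "operator" or "external".
    requested_by_zone: str


def is_allowed(req: ToolRequest) -> bool:
    """Allow side effects only when the request traces back to the operator."""
    if req.tool in READ_ONLY_TOOLS:
        return True
    if req.tool in SIDE_EFFECT_TOOLS:
        return req.requested_by_zone == "operator"
    return False  # unknown tools are denied by default
```

For example, `is_allowed(ToolRequest("send_email", "external"))` is `False`: text retrieved from a web page can be quoted in an answer, but it cannot cause an email to be sent. Treating read-only tools as always safe is a simplification, since reads can still leak data in some setups.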

The failure-trace work connects too. Every denied tool call, every high-risk injection scan, every wrapped retrieval is an event — and events are the substrate’s memory. A prompt-injection attempt that gets blocked is exactly the shape of a structured failure trace, just pointed at an adversarial input instead of a genuine one. The security layer and the evaluation layer are the same observability discipline aimed at different failure classes.
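As a rough picture of "every denied tool call is an event", a blocked attempt might be recorded like this. The field names, the event kinds, and the `SecurityEvent` and `record` names are assumptions for illustration, and a plain Python list stands in for whatever append-only store the substrate actually uses.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class SecurityEvent:
    kind: str        # e.g. "tool_call_denied", "injection_scan_high_risk"
    source: str      # where the triggering content came from
    detail: str      # short human-readable reason
    timestamp: str   # ISO 8601, UTC


def record(kind: str, source: str, detail: str, log: list) -> SecurityEvent:
    """Append one event to an append-only log (a plain list in this sketch)."""
    event = SecurityEvent(kind, source, detail, datetime.now(timezone.utc).isoformat())
    # Serialize with sorted keys so stored events are stable and easy to diff.
    log.append(json.dumps(asdict(event), sort_keys=True))
    return event
```

The record has the same shape whether the input was adversarial or just malformed, which is the sense in which the security layer and the evaluation layer share one observability discipline.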

Why I’m putting this on the public site

Because the AI frontier made this the honest version of the work I’ve always been pulled toward, and I’d rather say so publicly than keep the two halves of my thinking on separate laptops. Privacy as freedom, memory as substrate, intuition as observable code, and security architecture as the practical enforcement of all three — those aren’t separate projects. They’re the same research program with the adversary factored in.

Email is on the landing page. If you work in this area and want to push on any of it together, I read everything that comes in.