References and Influences

Papers, repos, and books that shaped what I’m building. Not a complete bibliography — a working list of things I keep coming back to. Each entry gets a short note on what it actually changed in my thinking, because a reference list without that is just a wall of citations.

Papers

Predicting Empirical AI Research Outcomes with Language Models

Authors: Jiaxin Wen, Chenglei Si, Chen Yueh-Han, He He, Shi Feng (2025)
Link: arxiv.org/abs/2506.00794
Related work on this site: Memory-guided evaluation, Failure-induced benchmarks, Why memory is the substrate

This is the one that lines up most directly with what I believe. They build a system that predicts which of two research ideas will work better, and a fine-tuned model beats expert NLP researchers on the task. The mechanism that makes that possible is, basically, accumulated experience over thousands of past papers — the same shape of memory I’m arguing for in the substrate. They’re saying out loud what I’ve been trying to say with a system: research intuition is something you can build, and you build it by letting a model accumulate the right kind of experience and then act on it. That’s the bet. Their evidence helps me say it with less hedging.

MemEvolve: Meta-Evolution of Agent Memory Systems

Authors: Guibin Zhang, Haotian Ren, Chong Zhan, Zhenhong Zhou, et al. (OPPO AI Agent Team, LV-NUS Lab, 2025)
Link: arxiv.org/abs/2512.18746
Related work on this site: Memory Dropbox, Why memory is the substrate

The argument I keep making — that memory architecture is itself a thing that should evolve, not a thing you hand-engineer once and freeze — is exactly what this paper goes after. They split memory into encode / store / retrieve / manage and let the architecture itself meta-adapt to the task. That four-piece split is close to what I have in memory-dropbox, and the meta-adaptation idea is where I want the substrate to go next. This paper is also a useful reminder that “memory system” is plural — there are at least twelve representative ones already in the literature — and that the design space is real, not a single canonical answer.
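
A minimal sketch of the four-piece split as an interface, to make concrete what "the architecture is the thing that varies" could mean. This is not MemEvolve's code or the memory-dropbox API: `MemoryItem`, `MemorySystem`, and `KeywordMemory` are hypothetical names, and the keyword-overlap retrieval and hit-count eviction are stand-ins chosen only to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class MemoryItem:
    key: str
    text: str
    hits: int = 0  # how many times retrieval has returned this item


class MemorySystem(Protocol):
    """Hypothetical interface: one method per stage of the four-piece split."""

    def encode(self, raw: str) -> MemoryItem: ...
    def store(self, item: MemoryItem) -> None: ...
    def retrieve(self, query: str, k: int = 3) -> list[MemoryItem]: ...
    def manage(self) -> None: ...


@dataclass
class KeywordMemory:
    """Toy implementation: keyword-overlap retrieval, capacity-bounded."""

    capacity: int = 100
    items: dict[str, MemoryItem] = field(default_factory=dict)

    def encode(self, raw: str) -> MemoryItem:
        # Normalise whitespace; the lowercased text doubles as the key.
        text = " ".join(raw.split())
        return MemoryItem(key=text.lower(), text=text)

    def store(self, item: MemoryItem) -> None:
        # Keep the first copy of any duplicate key.
        self.items.setdefault(item.key, item)

    def retrieve(self, query: str, k: int = 3) -> list[MemoryItem]:
        words = set(query.lower().split())
        scored = [(len(words & set(i.key.split())), i) for i in self.items.values()]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        top = [item for _, item in ranked[:k]]
        for item in top:
            item.hits += 1
        return top

    def manage(self) -> None:
        # Evict the least-retrieved items once the store exceeds capacity.
        overflow = len(self.items) - self.capacity
        if overflow > 0:
            for item in sorted(self.items.values(), key=lambda i: i.hits)[:overflow]:
                del self.items[item.key]
```

Under this framing, a meta-adaptive system would swap or tune whole implementations of the interface per task instead of fixing one up front.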

MetaGPT: Meta Programming for a Multi-Agent Collaborative Framework

Authors: Sirui Hong, Mingchen Zhuge, Jiaqi Chen, et al. (DeepWisdom, KAUST, 2023)
Link: arxiv.org/abs/2308.00352
Related work on this site: Obversary-OS

The thing I took from MetaGPT isn’t the SOP-as-prompts trick. It’s the discipline of giving each agent a structured output contract instead of letting them chat in unconstrained natural language. Most multi-agent setups die in the telephone-game step where one agent garbles the previous agent’s output. Forcing every handoff to be a structured artifact (a PRD, a design doc, a file list) is the same instinct that makes me want every memory event in the substrate to have a schema. It’s not glamorous. It’s what stops the system from drifting into nonsense.
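
A small sketch of what a structured handoff contract might look like in practice: the receiving side parses the previous agent's output against a fixed schema and rejects anything else, including free-form chat. The `DesignDoc` fields and the `parse_handoff` helper are hypothetical, chosen for illustration; they are not MetaGPT's actual artifact formats or the substrate's event schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignDoc:
    """Hypothetical handoff artifact passed from one agent to the next."""

    title: str
    files: tuple[str, ...]
    open_questions: tuple[str, ...] = ()


def parse_handoff(raw: str) -> DesignDoc:
    """Validate an agent's raw output against the contract, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"handoff is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("handoff must be a JSON object")

    # Reject fields outside the contract instead of silently passing them on.
    unknown = set(data) - {"title", "files", "open_questions"}
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")

    title = data.get("title")
    files = data.get("files")
    questions = data.get("open_questions", [])
    if not isinstance(title, str) or not title.strip():
        raise ValueError("title must be a non-empty string")
    if not isinstance(files, list) or not files or not all(isinstance(f, str) for f in files):
        raise ValueError("files must be a non-empty list of strings")
    if not isinstance(questions, list) or not all(isinstance(q, str) for q in questions):
        raise ValueError("open_questions must be a list of strings")
    return DesignDoc(title=title, files=tuple(files), open_questions=tuple(questions))
```

Failing loudly at the handoff is the point of the design: a garbled output stops at the boundary instead of being reinterpreted by the next agent.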

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

Authors: Jun Shern Chan, Neil Chowdhury, Oliver Jaffe, James Aung, et al. (OpenAI, 2024)
Link: arxiv.org/abs/2410.07095
Related work on this site: Failure-induced benchmarks, Failure-sliced eval, Evaluation Systems

The reason I keep coming back to this one is the choice of what to measure. They didn’t build another QA benchmark. They built a benchmark of real ML engineering work (Kaggle competitions, end to end) and then made agents try to do them. That’s the right shape of evaluation: pick the actual work the system is supposed to do, and measure whether it can do it without the human having to babysit. It’s a good model for how I want failure-induced benchmarks to grow. Less leaderboard, more “can the system actually finish the job?”

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Authors: BigScience Workshop (Le Scao, Fan, Akiki, Pavlick, et al., 2022)
Link: arxiv.org/abs/2211.05100
Related work on this site: Research Engineering Map

I keep this paper around for two reasons. One is the raw fact of it — hundreds of researchers, public funding, the model and the dataset and the process all out in the open. That’s the version of this field I want to be building toward. The other is the engineering of the thing: every pipeline decision is documented, every trade-off is named, the data curation has its own subsection, the bias evaluation has its own subsection. It’s the closest thing to a research-engineering reference I’ve read for “how do you actually coordinate the pieces of a model release without losing the thread.” When I’m writing about modular ingestion and provenance, this is one of the papers in the back of my head.

Security research (public reporting)

Public reporting and disclosed incidents are the grounding behind the security-research lane on this site (Security Research, Prompt injection, GitHub’s git push RCE, PyTorch Lightning package compromise PSA, Failure as Signal: Mythos, Glasswing, and the New Cyber Defense Loop, earth-database). These pieces are useful because they make the trust-boundary problem concrete in different environments: browser agents, coding agents, backend infrastructure, package ecosystems, deployment policy, and the model-driven vulnerability-discovery loop.

GitHub’s git push pipeline RCE (CVE-2026-3854)

Publications: GitHub Security Blog, Wiz Research, SecurityWeek, April 2026
Links: github.blog/security/securing-the-git-push-pipeline-responding-to-a-critical-remote-code-execution-vulnerability, wiz.io/blog/github-rce-vulnerability-cve-2026-3854, securityweek.com/critical-github-vulnerability-exposed-millions-of-repositories
Why it matters: This is the clean non-AI analogue for the whole prompt-injection doctrine on the site. User-controlled git push options crossed into trusted internal metadata, downstream services treated the injected fields as authority, and the result was a path to remote code execution. It’s the same law as prompt injection in a different parser: untrusted content crossed from the data plane into the control plane.
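
A toy illustration of the data-plane-to-control-plane crossing described above. It does not reproduce the actual GitHub vulnerability: the delimiter format, the `internal` field, and all function names are hypothetical, invented only to show the general pattern of user text being re-parsed as trusted fields.

```python
def build_metadata_unsafe(user: str, push_option: str) -> str:
    # Trusted fields and user-controlled text share one delimiter-separated string.
    return f"user={user};internal=false;option={push_option}"


def parse_metadata(blob: str) -> dict[str, str]:
    # Downstream parser: splits on ';' and lets the last occurrence of a key win.
    fields: dict[str, str] = {}
    for part in blob.split(";"):
        key, _, value = part.partition("=")
        fields[key] = value
    return fields


def build_metadata_safe(user: str, push_option: str) -> dict[str, dict[str, str]]:
    # User-controlled text lives in its own slot and is never re-parsed as fields.
    return {
        "trusted": {"user": user, "internal": "false"},
        "untrusted": {"option": push_option},
    }


# A push option that smuggles a delimiter and an extra field.
payload = "ci.skip;internal=true"
unsafe_view = parse_metadata(build_metadata_unsafe("alice", payload))
safe_view = build_metadata_safe("alice", payload)
```

In the unsafe path the downstream parser sees `internal` set by the attacker; in the safe path the same bytes stay inert inside the `untrusted` slot.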

PyTorch Lightning 2.6.3 package compromise

Publications: Lightning AI security advisory / issue report, BleepingComputer, May 2026
Links: github.com/Lightning-AI/pytorch-lightning/issues/21689, bleepingcomputer.com/news/security/backdoored-pytorch-lightning-package-drops-credential-stealer
Why it matters: This is an AI-community PSA as much as a security story. A malicious lightning==2.6.3 wheel reportedly executed on import, downloaded an obfuscated payload, and targeted cloud credentials, browser data, .env files, and GitHub tokens. The important lesson is simple and ugly: the AI stack is now an attack surface, and import is a trust boundary.

Claude Mythos Preview, Project Glasswing, and AI-powered cyber defense

Publications: Anthropic Frontier Red Team blog, Anthropic Project Glasswing announcement, Cisco blog, Google Threat Intelligence / Mandiant, April 2026 and October 2025
Links: red.anthropic.com/2026/mythos-preview, anthropic.com/glasswing, blogs.cisco.com/news/rising-to-the-era-of-ai-powered-cyber-defense, cloud.google.com/blog/topics/threat-intelligence/oracle-ebusiness-suite-zero-day-exploitation, watchTowr Labs — Oracle E-Business Suite pre-auth RCE chain
Why it matters: This is the first public cluster of writing that makes the next step explicit: models are no longer merely helping with code generation or triage, they are being used to discover zero-days, turn bugs into exploits, validate severity, and feed the results back into coordinated defensive programs. For me, that makes failure traces, provenance, and “failure as the second memory” feel much less like abstraction and much more like the shape the field is being forced into.

OpenAI says prompt injection may never be “solved” for browser agents like Atlas

Publication: CyberScoop, May 2026
Link: cyberscoop.com/openai-chatgpt-atlas-prompt-injection-browser-agent-security-update-head-of-preparedness
Why it matters: OpenAI publicly admitting that prompt injection may never be fully mitigated for browser agents is the honest version of the industry status on this vulnerability class. Their automated red-team attacker was specifically designed to chase multi-step harmful workflows rather than single misbehaviors, and the demo — a malicious email tricking Atlas into sending a resignation letter instead of the requested out-of-office reply — is exactly the reframe I use on the prompt-injection article: content that would try to persuade a person now tries to command an agent that’s already been empowered to act.

Vuln in Google’s Antigravity AI agent manager (Pillar Security disclosure)

Publication: CyberScoop, May 2026
Link: cyberscoop.com/google-antigravity-pillar-security-agent-sandbox-escape-remote-code-execution
Why it matters: The canonical example of a tool-boundary failure. The find_by_name tool was classified as native, which meant Antigravity’s Secure Mode — designed to sandbox commands, throttle network, and prohibit writes outside the working directory — never got a chance to see the dangerous path before execution. Pillar researcher Dan Lisichkin’s quote is the line that belongs on every agent-architecture wall: “The security boundary that Secure Mode enforces simply never sees this call.” This is the article that made the case for putting the trust/ module in earth-database at ingress, not at execution.
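
A minimal sketch of the difference between checking at execution and checking at ingress. This is not Antigravity's code or the earth-database trust/ module: only the tool name `find_by_name` comes from the reporting, while `WORKDIR`, `inside_workdir`, and both dispatchers are hypothetical, and the path check is a purely lexical stand-in for a real sandbox policy.

```python
from pathlib import PurePosixPath

WORKDIR = PurePosixPath("/workspace")  # hypothetical working directory


def inside_workdir(path: str) -> bool:
    """True if path, resolved lexically against WORKDIR, stays inside it."""
    parts: list[str] = []
    for part in (WORKDIR / path).parts[1:]:
        if part == "..":
            if parts:
                parts.pop()
        elif part != ".":
            parts.append(part)
    resolved = PurePosixPath("/", *parts)
    return resolved == WORKDIR or WORKDIR in resolved.parents


def dispatch_exec_time_check(tool: str, path: str, native: set[str]) -> str:
    # Flawed layout: tools classified as native return before the policy check.
    if tool in native:
        return f"ran {tool} on {path}"
    if not inside_workdir(path):
        raise PermissionError(path)
    return f"ran {tool} on {path}"


def dispatch_ingress_check(tool: str, path: str) -> str:
    # Every call is checked first, before any classification-based routing.
    if not inside_workdir(path):
        raise PermissionError(path)
    return f"ran {tool} on {path}"
```

With the first dispatcher, a native tool given `../etc` runs unchecked; with the second, the same call raises `PermissionError` because no classification can route around the check.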

CISA, NSA, and the Five Eyes joint guidance on agentic AI deployment

Publication: CyberScoop, May 2026
Link: cyberscoop.com/cisa-nsa-five-eyes-guidance-secure-deployment-ai-agents
Why it matters: The five risk categories (privilege, design/configuration, behavioral, structural, accountability) line up almost perfectly with the doctrine in my prompt-injection article. The guidance’s central move is to fold agentic AI into existing cybersecurity frameworks (zero trust, defense-in-depth, least privilege) rather than invent a parallel discipline, and that is the framing I want the whole lane to inherit. The sentence I keep coming back to: “Organisations should assume that agentic AI systems may behave unexpectedly and plan deployments accordingly, prioritising resilience, reversibility and risk containment over efficiency gains.” That’s the load-bearing quote for why any of this research has to happen now, in the open.

Repositories I keep open

These aren’t formal citations, but they’re as influential as any paper. Each one is a working artifact I either use, fork from, or watch closely.

obversary/memory-dropbox — my own substrate. Linked here because the substrate question is the spine of the rest of the work.
obversary/earth-database — the low-latency embedded sibling of memory-dropbox, and the first place the trust-aware memory ingress layer gets built. Cited from the security research lane.
obversary/Obversary-OS — runtime layer above the substrate.
obversary/pdf-intelligence-core — first applied ingestion lane (PDFs).
obversary/memoryevalguided — failure trace schema and evaluation harness.
obversary/failure-induced-benchmarks — turning traces into harder questions.

A note on what’s missing

This list is intentionally short. I’d rather have five entries I can talk about than fifty I’m name-dropping. As I write more articles, more references will land here. If you’re reading this and you think there’s a paper I should be sitting with, my email is on the landing page — I read everything that comes in.