The Memory Architecture: How Corrections Compound

*Without a memory architecture, every correction evaporates at session end. Same mistake, next week, same frustration. Borges closes the loop — a Monday correction becomes a permanent skill improvement by Wednesday.*
---
The Correction That Keeps Happening
I have seven AI agents. When I give them feedback, they hear it. They adjust for that session. And then the session ends.
Next week, the same mistake.
This cycle is not a model failure. It's a memory architecture failure. The feedback loop between human judgment and system improvement doesn't close unless you deliberately close it. (And nobody tells you this when you start building with agents. The demos don't show the correction loop. They show the capability.)
Borges is the architecture I built to close that loop. (Named for Jorge Luis Borges — the Argentine author who wrote about infinite libraries, labyrinthine memory systems, and stories where every moment branches into infinite parallel possibilities. The naming is slightly pretentious and I stand by it entirely.)
What "Compounding Corrections" Actually Means
A correction that compounds works like this:

Monday: I correct APRIL's output — the draft cited a source URL that had 404'd.
Monday, 20 seconds later: APRIL writes to `shared/ERRORS.md`:
```
## 2026-03-31: Source URL 404'd in published draft (APRIL) [DARWIN: blog-exec]
- Error: Draft cited arxiv.org/abs/2601.XXXX — URL returned 404 at time of review
- Fix/Rule: Validate all external URLs resolve before draft moves to review stage
- Severity: medium
```
APRIL writes this *before* replying to me. Not after. Before. (The ordering is the whole thing. If the agent replies first, the session can get compacted — context truncated — and the error note never gets written. DO → WRITE → REPLY is a boot-level rule.)
Monday night: Borges runs nightly consolidation. Cross-agent rule added to LEARNINGS.md.
Sunday 1:30 AM: The weekly harvest reads ERRORS.md. The blog-exec error appears in Section 3 (Rule Violations).
Sunday 2:00 AM: Darwin mutates blog-exec. Sandboxes. Scores. Promotes.
Following Monday: APRIL's blog-exec skill now includes URL validation. The 404 mistake doesn't happen again. Not because I remembered. Because the skill itself changed.
The Architecture: What Borges Actually Is
Borges is not a single file. It's a set of behaviors:
- The correction entry point: `shared/ERRORS.md`
- The cross-agent knowledge base: `shared/LEARNINGS.md`
- The nightly consolidation: reads session memory across all agents, extracts durable lessons
- The daily memory logs: `memory/YYYY-MM-DD.md` for session continuity
- The Darwin bridge: `[DARWIN: skill-name]` tags route errors to the optimization queue
Every agent reads ERRORS.md and LEARNINGS.md at session start. Not as a courtesy — as a boot requirement, before any task begins.
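The boot requirement is easy to make mechanical. A minimal sketch of what "read before any task" could look like — `boot_context` and the directory layout are assumptions drawn from the file paths named above:

```python
from pathlib import Path

# Shared files every agent must read first, in this order.
BOOT_FILES = ["shared/ERRORS.md", "shared/LEARNINGS.md"]


def boot_context(workspace: Path, today: str, yesterday: str) -> str:
    """Concatenate everything an agent reads before its first task."""
    paths = [workspace / f for f in BOOT_FILES]
    paths += [workspace / "memory" / f"{d}.md" for d in (today, yesterday)]
    parts = []
    for p in paths:
        if p.exists():  # a missing daily log is normal on a fresh day
            parts.append(f"# FILE: {p.name}\n{p.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)
```

Prepending the result to the agent's context makes the boot sequence a property of the harness rather than a promise in a prompt.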
The [DARWIN: skill-name] Tag
Not every error is a skill prompt failure. The tag separates signal from noise:

Tag this: Agent skipped a rule written in SKILL.md. Output missing a required section. Wrong format, wrong sourcing — things that exist as rules.
Don't tag this: Script crashed. API down. Model hallucinated a fact. Not fixable via skill mutation.
The practical effect: Darwin's queue only contains work it can actually do. Without this filter, Darwin would waste cycles on infrastructure failures while real skill prompt failures accumulate untreated.
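The filter itself is one regular expression. A minimal sketch of how a harvest could build Darwin's queue — `darwin_queue` is my illustrative name, not the actual pipeline:

```python
import re

# Matches tags like [DARWIN: blog-exec] anywhere in ERRORS.md.
DARWIN_TAG = re.compile(r"\[DARWIN:\s*([\w-]+)\]")


def darwin_queue(errors_md: str) -> list[str]:
    """Return the skill names Darwin should consider mutating.

    Untagged entries (script crashes, API outages, hallucinated facts)
    are deliberately excluded: they are not fixable via skill mutation.
    """
    return DARWIN_TAG.findall(errors_md)
```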
Nightly Consolidation
Every night, Borges reads all agent daily memory files and answers three questions:
1. What should become a cross-agent rule? A correction to APRIL that would also apply to Scout or Jarvis gets promoted to LEARNINGS.md. "DO → WRITE → REPLY" started as an APRIL-specific rule, became cross-agent after Borges noticed the same failure in three agents.
2. What errors have been resolved? Entries with corresponding Darwin mutations get tagged `[RESOLVED]` and archived. Keeps ERRORS.md manageable — agents read it at boot.
3. What memory is worth keeping? Daily logs stay active 7 days. After that, Borges extracts what generalizes before the logs expire. (Getting this filtering right is genuinely hard. Too aggressive: you lose context. Too conservative: agents read a novel every morning.)
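The 7-day retention window from question 3 can be sketched as a simple date filter over the `memory/YYYY-MM-DD.md` naming convention. The function name `expiring_logs` is my own; the lesson-extraction step it feeds is not shown:

```python
from datetime import date, timedelta
from pathlib import Path


def expiring_logs(memory_dir: Path, today: date, active_days: int = 7) -> list[Path]:
    """Daily logs older than the active window, due for lesson extraction."""
    cutoff = today - timedelta(days=active_days)
    expired = []
    for p in sorted(memory_dir.glob("*.md")):
        try:
            logged = date.fromisoformat(p.stem)  # expects memory/YYYY-MM-DD.md
        except ValueError:
            continue  # skip files that aren't daily logs
        if logged < cutoff:
            expired.append(p)
    return expired
```

Consolidation would read each expired log, promote anything that generalizes into LEARNINGS.md, then delete or archive the log itself.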
The Boot Sequence
Every agent has this at the top of AGENTS.md:
```
BEFORE STARTING ANY TASK:
1. Read shared/ERRORS.md — avoid known mistakes
2. Read shared/LEARNINGS.md — apply cross-agent rules
3. Read memory/[today].md and memory/[yesterday].md — continuity
4. If task involves a specific skill, read that skill's SKILL.md

DO → WRITE → REPLY
When corrected, write to shared/ERRORS.md BEFORE replying.
Add a [DARWIN: skill-name] tag if it's a skill prompt failure.
```
This is why a Monday correction reaches all agents by Tuesday. LEARNINGS.md is updated Monday night. Tuesday morning, all seven agents read it before starting work.
What Breaks the Loop (The Anti-Patterns)
Three things destroy the correction-compounding loop:
1. Replying before writing. Agent receives correction → replies with adjusted output → session compacts → correction never written. Same mistake next week.
2. Untagged errors. ERRORS.md without [DARWIN: skill-name] is invisible to Darwin. Problems accumulate untreated.
3. Vague corrections. "The output wasn't great" produces vague mutations. Every entry needs a specific failure description, a specific rule, and a specific skill target.
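The third anti-pattern can be caught mechanically. A minimal lint sketch against the entry format shown earlier — `lint_entry` is an illustration, and note it does not treat a missing tag as an error, since infrastructure failures are legitimately untagged:

```python
# Structural fields every usable ERRORS.md entry must contain.
REQUIRED = ("- Error:", "- Fix/Rule:", "- Severity:")


def lint_entry(entry: str) -> list[str]:
    """Flag vague ERRORS.md entries before they reach the weekly harvest.

    Checks only the structural fields; an entry may legitimately lack a
    [DARWIN: ...] tag when the failure is infrastructure, not a skill prompt.
    """
    return [f"missing {field!r}" for field in REQUIRED if field not in entry]
```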
The Compounding Loop

The shift isn't technical. It's philosophical.
Before this architecture: corrections are maintenance. You correct an agent, it improves for a session, the session ends, and next week you make the same correction again.
After this architecture: corrections are investments. One well-tagged correction → one ERRORS.md entry → harvest signal → Darwin mutation → permanent skill change. The same mistake costs you once.
Five months of well-tagged corrections produces a skill set optimized by hundreds of specific production failures. Five months of untagged corrections produces the same agents you started with.
Build This in Your Own System
Step 1: Create shared/ERRORS.md. One file. All agents write here.
Step 2: Create shared/LEARNINGS.md. Seed it with your 5 most important lessons from last month.
Step 3: Add the boot sequence to AGENTS.md. Read ERRORS + LEARNINGS before any task.
Step 4: Add DO → WRITE → REPLY as a hard rule. Not a guideline.
Step 5: Tag skill prompt failures with [DARWIN: skill-name]. Even without automation, the tags help you manually identify which skills need work.
Step 6: Once a week, read ERRORS.md. Look for patterns. What appears more than once? Add that rule to the SKILL.md.
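Step 6's pattern-hunting can be partly automated even in the manual loop. A minimal sketch, assuming the `[DARWIN: skill-name]` tag convention from Step 5; `recurring_skills` is my own name for it:

```python
import re
from collections import Counter


def recurring_skills(errors_md: str, min_count: int = 2) -> dict[str, int]:
    """Skills tagged in more than one ERRORS.md entry — candidates for a
    new rule in their SKILL.md."""
    counts = Counter(re.findall(r"\[DARWIN:\s*([\w-]+)\]", errors_md))
    return {skill: n for skill, n in counts.items() if n >= min_count}
```

Anything this returns is exactly the "appears more than once" signal Step 6 asks you to look for by eye.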
That's the manual Borges loop. I ran it for three weeks before automating. That was the right call — you learn a lot about what the system should catch when you're doing the reading yourself first.
---
*Borges runs nightly at 23:30 IST across all agent workspaces on OpenClaw. Daily memory retention: 7 days active. Current ERRORS.md: 22 active entries (18 tagged, 4 untagged infrastructure logs), 15 archived as [RESOLVED].*
*The anti-pattern I see most: agents that produce great output in demos but show no improvement trajectory over time. The corrections are happening. They're just not being written down.*