Build note · Apr 6, 2026 · 7 min read

The Memory Architecture: How Corrections Compound

Without a memory architecture, every AI agent correction evaporates at session end. Same mistake, next week. Borges closes the loop — a Monday correction becomes a permanent skill improvement by Wednesday.


---

The Correction That Keeps Happening

I have seven AI agents. When I give them feedback, they hear it. They adjust for that session. And then the session ends.

Next week, the same mistake.

This cycle is not a model failure. It's a memory architecture failure. The feedback loop between human judgment and system improvement doesn't close unless you deliberately close it. (And nobody tells you this when you start building with agents. The demos don't show the correction loop. They show the capability.)

Borges is the architecture I built to close that loop. (Named for Jorge Luis Borges — the Argentine author who wrote about infinite libraries, labyrinthine memory systems, and stories where every moment branches into infinite parallel possibilities. The naming is slightly pretentious and I stand by it entirely.)

What "Compounding Corrections" Actually Means

A correction that compounds works like this:

The 6-Stage Correction Flow — from human feedback to permanent skill change

Monday: I correct APRIL's output — the draft cited a source URL that had 404'd.

Monday, 20 seconds later: APRIL writes to `shared/ERRORS.md`:

```
## 2026-03-31: Source URL 404'd in published draft (APRIL) [DARWIN: blog-exec]
- Error: Draft cited arxiv.org/abs/2601.XXXX — URL returned 404 at time of review
- Fix/Rule: Validate all external URLs resolve before draft moves to review stage
- Severity: medium
```

APRIL writes this *before* replying to me. Not after. Before. (The ordering is the whole thing. If the agent replies first, the session can get compacted — context truncated — and the error note never gets written. DO → WRITE → REPLY is a boot-level rule.)

Monday night: Borges runs nightly consolidation. Cross-agent rule added to LEARNINGS.md.

Sunday 1:30 AM: The weekly harvest reads ERRORS.md. The blog-exec error appears in Section 3 (Rule Violations).

Sunday 2:00 AM: Darwin mutates blog-exec. Sandboxes. Scores. Promotes.

Following Monday: APRIL's blog-exec skill now includes URL validation. The 404 mistake doesn't happen again. Not because I remembered. Because the skill itself changed.

The Architecture: What Borges Actually Is

Borges is not a single file. It's a set of behaviors:

  • The correction entry point: `shared/ERRORS.md`
  • The cross-agent knowledge base: `shared/LEARNINGS.md`
  • The nightly consolidation: reads session memory across all agents, extracts durable lessons
  • The daily memory logs: `memory/YYYY-MM-DD.md` for session continuity
  • The Darwin bridge: `[DARWIN: skill-name]` tags route errors to the optimization queue

Every agent reads ERRORS.md and LEARNINGS.md at session start. Not as a courtesy — as a boot requirement, before any task begins.
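As a minimal sketch of that boot requirement (the file paths come from the list above; the helper itself is hypothetical, not Borges's actual code):

```python
from datetime import date, timedelta
from pathlib import Path


def load_boot_context(workspace: Path) -> dict:
    """Assemble the context every agent must read before any task."""
    today = date.today()
    yesterday = today - timedelta(days=1)
    files = {
        "errors": workspace / "shared" / "ERRORS.md",
        "learnings": workspace / "shared" / "LEARNINGS.md",
        "memory_today": workspace / "memory" / f"{today:%Y-%m-%d}.md",
        "memory_yesterday": workspace / "memory" / f"{yesterday:%Y-%m-%d}.md",
    }
    context = {}
    for name, path in files.items():
        if path.exists():
            context[name] = path.read_text()
        elif name in ("errors", "learnings"):
            # Shared files are a hard boot requirement, not a courtesy.
            raise FileNotFoundError(f"Boot requirement missing: {path}")
        else:
            context[name] = ""  # a missing daily log just means a fresh day
    return context
```

The asymmetry is deliberate: a missing daily log is normal, but a missing ERRORS.md or LEARNINGS.md should stop the agent before any task begins.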

The [DARWIN: skill-name] Tag

Not every error is a skill prompt failure. The tag separates signal from noise:

ERRORS.md Format — properly tagged vs. not tagged

Tag this: Agent skipped a rule written in SKILL.md. Output missing a required section. Wrong format, wrong sourcing — things that exist as rules.

Don't tag this: Script crashed. API down. Model hallucinated a fact. Not fixable via skill mutation.

The practical effect: Darwin's queue only contains work it can actually do. Without this filter, Darwin would waste cycles on infrastructure failures while real skill prompt failures accumulate untreated.

Nightly Consolidation

Every night, Borges reads all agent daily memory files and answers three questions:

1. What should become a cross-agent rule? A correction to APRIL that would also apply to Scout or Jarvis gets promoted to LEARNINGS.md. "DO → WRITE → REPLY" started as an APRIL-specific rule, became cross-agent after Borges noticed the same failure in three agents.

2. What errors have been resolved? Entries with corresponding Darwin mutations get tagged `[RESOLVED]` and archived. Keeps ERRORS.md manageable — agents read it at boot.

3. What memory is worth keeping? Daily logs stay active 7 days. After that, Borges extracts what generalizes before the logs expire. (Getting this filtering right is genuinely hard. Too aggressive: you lose context. Too conservative: agents read a novel every morning.)
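The 7-day retention window in question 3 reduces to a date cutoff over the `memory/YYYY-MM-DD.md` filenames. A sketch (the extraction step that runs before expiry is not shown):

```python
from datetime import date, timedelta
from pathlib import Path


def expired_logs(memory_dir: Path, today: date, keep_days: int = 7) -> list:
    """Return daily memory logs older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    stale = []
    for log in memory_dir.glob("*.md"):
        try:
            log_date = date.fromisoformat(log.stem)  # stem is YYYY-MM-DD
        except ValueError:
            continue  # skip files that aren't daily logs
        if log_date < cutoff:
            stale.append(log)
    return sorted(stale)
```

Anything this returns should have already been through the "what generalizes?" extraction pass before it is archived or deleted.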

The Boot Sequence

Every agent has this at the top of AGENTS.md:

```
BEFORE STARTING ANY TASK:
1. Read shared/ERRORS.md — avoid known mistakes
2. Read shared/LEARNINGS.md — apply cross-agent rules
3. Read memory/[today].md and memory/[yesterday].md — continuity
4. If task involves a specific skill, read that skill's SKILL.md

DO → WRITE → REPLY
When corrected, write to shared/ERRORS.md BEFORE replying.
[DARWIN: skill-name] tag if it's a skill prompt failure.
```

This is why a Monday correction reaches all agents by Tuesday. LEARNINGS.md is updated Monday night. Tuesday morning, all seven agents read it before starting work.

What Breaks the Loop (The Anti-Patterns)

Three things destroy the correction-compounding loop:

1. Replying before writing. Agent receives correction → replies with adjusted output → session compacts → correction never written. Same mistake next week.

2. Untagged errors. ERRORS.md without [DARWIN: skill-name] is invisible to Darwin. Problems accumulate untreated.

3. Vague corrections. "The output wasn't great" produces vague mutations. Every entry needs a specific failure description, a specific rule, and a specific skill target.
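Anti-patterns 2 and 3 are mechanical enough to lint. A hypothetical check, with field names taken from the entry format shown earlier:

```python
def lint_entry(entry: str) -> list:
    """Flag the anti-patterns a correction entry can carry."""
    problems = []
    if "[DARWIN:" not in entry:
        problems.append("untagged: invisible to Darwin's queue")
    if "- Error:" not in entry:
        problems.append("no specific failure description")
    if "- Fix/Rule:" not in entry:
        problems.append("no specific rule to mutate toward")
    return problems
```

Note what it cannot catch: a "- Error:" line that says "output wasn't great" passes the lint. Specificity still needs a human eye.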

The Compounding Loop

The Compounding Loop — corrections are investments, not rent

The shift isn't technical. It's philosophical.

Before this architecture: corrections are maintenance. You correct an agent, it improves for a session, the session ends, you maintain the same correction again next week.

After this architecture: corrections are investments. One well-tagged correction → one ERRORS.md entry → harvest signal → Darwin mutation → permanent skill change. The same mistake costs you once.

Five months of well-tagged corrections produces a skill set optimized by hundreds of specific production failures. Five months of untagged corrections produces the same agents you started with.

Build This in Your Own System

Step 1: Create shared/ERRORS.md. One file. All agents write here.

Step 2: Create shared/LEARNINGS.md. Seed it with your 5 most important lessons from last month.

Step 3: Add the boot sequence to AGENTS.md. Read ERRORS + LEARNINGS before any task.

Step 4: Add DO → WRITE → REPLY as a hard rule. Not a guideline.

Step 5: Tag skill prompt failures with [DARWIN: skill-name]. Even without automation, the tags help you manually identify which skills need work.

Step 6: Once a week, read ERRORS.md. Look for patterns. What appears more than once? Add that rule to the SKILL.md.

That's the manual Borges loop. I ran it for three weeks before automating. That was the right call — you learn a lot about what the system should catch when you're doing the reading yourself first.
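For Step 6, the weekly pattern read can start as a frequency count over the tags (a sketch; the judgment call about what rule to write stays manual):

```python
import re
from collections import Counter


def repeat_offenders(errors_md: str, threshold: int = 2) -> list:
    """Skills whose tag appears repeatedly — candidates for a SKILL.md rule."""
    tags = re.findall(r"\[DARWIN:\s*([\w-]+)\]", errors_md)
    counts = Counter(tags)
    return [(skill, n) for skill, n in counts.most_common() if n >= threshold]
```

Anything this surfaces more than once is the "what appears more than once?" signal from Step 6, pre-sorted by frequency.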

---

*Borges runs nightly at 23:30 IST across all agent workspaces on OpenClaw. Daily memory retention: 7 days active. Current ERRORS.md: 22 active entries (18 tagged, 4 untagged infrastructure logs), 15 archived as [RESOLVED].*

*The anti-pattern I see most: agents that produce great output in demos but show no improvement trajectory over time. The corrections are happening. They're just not being written down.*


Arif Khan

Founder building companies where humans and AI agents have real jobs. Writing about what actually works.

