Part 1
Our AI Agent Scored 25% on Its Most Important Skill
How we discovered our agents were failing silently, and the scoring loop we built to catch those failures and auto-improve existing skills.
Part 2
How Darwin Discovers and Creates New Agent Skills
The weekly harvest pipeline that reads every agent conversation to find capability gaps and propose new skills before anyone asks.
Part 3
The Full Self-Improvement Loop
How scoring, research, and corrections connect into an agent network that measurably improves every week.