# Self-Improvement Loops for AI Tools
I use Claude Code daily. After a few weeks, I noticed a pattern: I kept correcting the same mistakes. Claude would forget context, miss project conventions, or suggest approaches we’d already tried.
The fix wasn’t better prompts. It was building feedback loops.
## The Core Insight
Every session is an opportunity to make things better for the next session.
When Claude makes a mistake, that’s not just a problem to fix—it’s a signal. Something should change so the mistake never happens again. The question is: where does that change live?
## Three Types of Improvements
I settled on three categories:
| Type | What Changes | Example |
|---|---|---|
| Knowledge | Facts to reference | “Composio SDK requires toolkit version, not just toolkit name” |
| Context | Instructions to follow | “Always check for existing patterns before creating new utilities” |
| Infrastructure | New capabilities | A command that automates a repeated workflow |
Knowledge is passive—Claude references it when relevant. Context is active—Claude follows it as instructions. Infrastructure is structural—new tools that didn’t exist before.
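In practice, the three types map onto different places on disk. `CLAUDE.md` and `.claude/commands/` are standard Claude Code locations; the knowledge directory is just a convention, and the exact paths here are illustrative:

```
project/
├── CLAUDE.md            # context: instructions Claude follows every session
├── docs/knowledge/      # knowledge: facts Claude references when relevant
└── .claude/
    └── commands/        # infrastructure: custom slash commands like /publish
```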
## The /publish Command
I built a command called /publish that runs at the end of meaningful sessions. It does three things:
- Git operations - Stage, commit, push changes
- Documentation check - Do any READMEs need updating?
- Improvement check - What should persist for next time?
The improvement check is the key part. It asks: what did we learn? What did the user correct? Should this become a reusable command?
All changes require approval. The human stays in the loop. This prevents the system from getting worse through bad generalizations.
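Claude Code picks up custom slash commands from Markdown files in `.claude/commands/`, so /publish is just a prompt checked into the repo. A trimmed-down sketch of what the command file can look like (the wording here is illustrative, not the full version):

```markdown
<!-- .claude/commands/publish.md (illustrative sketch) -->
Wrap up this session:

1. Git operations: stage, commit, and push this session's changes with a
   descriptive commit message.
2. Documentation check: look for READMEs or docs made stale by these changes
   and propose updates.
3. Improvement check: list what should persist for next time, including
   corrections I made, SDK gotchas we hit, and workflows repeated often
   enough to deserve their own command. For each item, say whether it
   belongs in the knowledge files, CLAUDE.md, or a new command, and wait
   for my approval before writing anything.
```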
## What Actually Gets Captured
Here’s what triggers an improvement:
User corrections → If you tell Claude “that’s not how we do it here,” that correction should persist. Otherwise you’ll have the same conversation again.
SDK gotchas → When documentation differs from actual behavior, that gap needs recording. Future sessions shouldn’t have to rediscover it.
Repeated workflows → If you’ve done the same sequence of steps three times, it should probably become a command.
Project conventions → Team preferences, architectural decisions, naming patterns. Things that aren’t wrong elsewhere but are wrong here.
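As a concrete example, the Composio note from the table above is the kind of thing that ends up in a knowledge file. Something like this (path and wording illustrative):

```markdown
<!-- docs/knowledge/composio-sdk.md (illustrative) -->
# Composio SDK

- Requires the toolkit version, not just the toolkit name. Recorded here so
  future sessions don't have to rediscover it.
```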
## The Feedback Loop in Practice

Session work → /publish → Identify improvements → User approves → Apply changes

Approved improvements land in one of three places:
- Knowledge files (facts to reference)
- Claude context (instructions to follow)
- Infrastructure (new commands)

And the next session starts smarter.
The key is that improvements happen in real time during the session, not batched at the end. When something goes wrong, I fix the immediate problem and then ask: should this persist?
/publish is a verification checkpoint, not the only time learning happens.
## Results
After a month of this approach:
- Fewer repeated corrections
- Claude suggests patterns consistent with my codebase
- New projects scaffold with learned conventions built-in
- Commands handle workflows I used to explain manually
The system compounds. Each improvement makes the next session slightly better. Over enough sessions, the difference is significant.
## The Philosophy
I have a line in my Claude instructions: “Every day we get a little bit better.”
It sounds simple. But systematizing it—building the infrastructure so improvements actually persist—took intentional design. The tools had to exist. The habits had to form.
Now it’s automatic. End of session, run /publish, capture what matters, move on. Tomorrow’s Claude is slightly smarter than today’s.
Related: Building This Site, My Claude Code Setup