This blog post is Human-Centered Content: Written by humans for humans.
If you’ve been using Claude Code for more than a few weeks, you probably have the same problem I did: Too much stuff installed.
It starts innocently. You see a plugin that looks useful, so you install it. Someone shares a skill for generating presentations, so you grab that too. A new frontend design plugin catches your eye. Then a code review one. Then a code simplifier. Then a whole Railway deployment suite. Before you know it, your setup has accumulated 30+ skills, a handful of agents and multiple plugins with overlapping capabilities, and Claude is spending its context window just reading the descriptions of everything available to it, many of which sound similar.
That last part is the real problem. AI models perform worse when given too many options. Every skill and plugin loaded into a session is competing for attention in the context window, and you start getting inconsistent results. Which frontend-design skill did it use? The one from Anthropic or the one from Impeccable? Hard to tell. Hard to make sure you’re getting the best results.
I recently sat down to clean house, and what came out the other side is a setup I’m genuinely happy with. Not because I removed everything, but because I got intentional about what stays, what goes and how the pieces coordinate.
What I Kept (and Why)
Two third-party tools earned permanent spots because they genuinely make the work better.
Superpowers is a plugin from Anthropic’s official plugin collection. It provides a set of workflow skills that enforce good development habits: Writing plans before coding, verifying work before claiming completion, dispatching parallel agents for independent tasks, using git worktrees for isolation and running structured code review. The verification-before-completion skill alone is worth the install. It enforces a simple rule: No claiming work is done without running the actual verification command and reading the output. Sounds obvious, but AI agents love to say “that should work now” without checking. Superpowers makes that impossible.
Impeccable handles frontend design work. It provides specialized skills for different aspects of UI quality: Typography, color, animation, layout, accessibility audits and more. When you need to build a web interface, “impeccable:frontend-design” produces significantly better results than asking Claude to “make it look good.” And when the initial result needs refinement, skills like “impeccable:bolder” or “impeccable:polish” let you iterate on specific dimensions without starting over.
Everything else I removed, consolidated or replaced with custom versions that fit how I actually work.
The Coordination Problem
Even with good tools, a single Claude session trying to do everything at once can lead to bad results. When an agent explores the codebase, writes the plan and implements the feature, by the time it gets to testing, so much context has been used that it starts forgetting to do the testing, or to do it well, or what the success criteria looked like. Long sessions can lead the agent astray. The code review passes because the earlier context says this was the right way to code the feature. The result was inconsistent quality, and sometimes the agent completely lost track of what we were trying to accomplish.
A big help came from a pattern in project management: hub-and-spoke delegation.
Hub-and-spoke is simple. One central coordinator (the hub) manages the work by dispatching tasks to focused specialists (the spokes). The coordinator doesn’t do the work itself. It selects what needs doing, gives each specialist exactly the context they need, collects the results and enforces quality checks before moving on. The specialists work in isolation with clean context, which means they do their one job well instead of juggling everything.
In Claude Code, this translates to a coordinator agent that dispatches subagents. Each subagent starts with a fresh context window containing only what it needs for its specific task. The coordinator preserves its own context for orchestration rather than burning it on implementation details.
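The pattern itself is easy to picture in plain code. A minimal sketch, where `Task`, `dispatch` and `coordinate` are hypothetical names standing in for Claude Code’s subagent mechanics, not its actual API:

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    context: str  # only what this specialist needs, nothing more

def dispatch(task: Task) -> str:
    """Stand-in for spawning a subagent with a fresh context window.
    In Claude Code this would be a real subagent; here it just
    returns a labeled result."""
    return f"{task.name}: done ({task.context})"

def coordinate(feature: str) -> list[str]:
    """The hub: picks the work, hands each spoke a narrow brief,
    and collects results. It never does the work itself."""
    pipeline = [
        Task("implement", f"spec for {feature}"),
        Task("spec-review", f"requirements for {feature}"),
        Task("quality-review", "project coding patterns"),
    ]
    return [dispatch(t) for t in pipeline]
```

The point of the sketch is the shape, not the code: each spoke receives a narrow slice of context, and the hub only ever sees results.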
The best part? You can just ask Claude to do it. Tell it to use subagents and it will. Of course, it can be helpful to specify that there’s a coding agent, a dedicated review agent and a QA specialist too.
The result is sessions that are more likely to stay focused, and because each subagent starts clean, the work is more reliable than a single agent trying to hold an entire project in its head.
The Agents
I have four custom agents, each with a specific role. Three of them form the core development pipeline.
Project Initializer
This agent bootstraps new projects. You describe what you want to build, and it decomposes your requirements into granular, testable features stored in a feature_list.json file. It creates the startup script, initializes git and sets up the progress log that future agents use for session handoffs.
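The exact schema of the feature list is up to you; a fragment of what mine looks like, with purely illustrative field names and values:

```json
{
  "features": [
    {
      "id": "auth-login-form",
      "description": "Email/password login with inline validation errors",
      "priority": 1,
      "status": "incomplete",
      "acceptance_criteria": [
        "Submitting valid credentials redirects to the dashboard",
        "Invalid credentials show an inline error"
      ],
      "taste_driven": false
    }
  ]
}
```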
The interesting part is what it references. I built two architecture skills that serve as blueprints:
- web-project-architecture defines our default web stack: React, Fastify, TanStack Router and Query, shadcn/ui, Tailwind, Prisma, and PostgreSQL. It includes project structure templates, core patterns (how to fetch data, handle forms, scope database queries), and deployment configuration.
- mcp-project-architecture defines our default MCP server stack: FastMCP 3.0 with Python, using the gateway composition pattern where each service is a standalone server mounted into a central gateway with namespaced tools.
Both default to Railway for deployment and Cloudflare Zero Trust for authentication. They’re opinionated starting points, not mandates. Swap any layer you want, but having a documented default means you’re making conscious deviations instead of arbitrary choices.
The initializer also evaluates whether any features are taste-driven rather than logic-driven. A login form has a correct implementation. A homepage hero section does not. For taste-driven work, it flags features for parallel solutions: the sprint coordinator will later create multiple versions in separate git worktrees so you can compare and choose. More on that in a moment.
Sprint Coordinator
This is the hub. It never writes application code. Its job is orchestration.
When you say “sprint,” it reads the feature list, picks the highest-priority incomplete feature, and drives it through the full pipeline:
- Dispatch an implementer subagent with the feature spec, project context and instructions to implement, test and self-review. The coordinator selects the model based on complexity: Cheaper models for mechanical tasks, more capable ones for integration work.
- Dispatch a spec compliance reviewer that reads the actual code (not the implementer’s report) and verifies it matches the requirements. Nothing missing, nothing extra.
- Dispatch a code quality reviewer that checks for clean code, test quality, and adherence to project patterns.
- Verify with evidence. This is where Superpowers’ verification-before-completion skill kicks in. The coordinator must run tests and read the output before claiming anything passes. No “should work now.” No trusting subagent reports. Evidence before assertions.
- Hand off to QA or document for the next session.
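The evidence rule in the verify step above fits in a few lines of Python. This is a hypothetical helper, not part of Superpowers or Claude Code; it just shows the idea of running the real command and reading the real output instead of trusting a report:

```python
import subprocess
import sys

def verify_with_evidence(cmd: list[str]) -> tuple[bool, str]:
    """Run the actual verification command and capture its output.
    Success is the exit code, not anyone's claim that it passed."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    evidence = result.stdout + result.stderr
    return result.returncode == 0, evidence

# Example: only claim success if the exit code says so.
passed, evidence = verify_with_evidence(
    [sys.executable, "-c", "print('2 tests passed')"]
)
```

The coordinator applies the same rule to every stage: a subagent saying “done” counts for nothing until the output has actually been read.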
For taste-driven features flagged by the initializer, the coordinator takes a different path. It creates separate git worktrees (using Superpowers’ using-git-worktrees skill), dispatches parallel implementer subagents each with a different creative direction, and presents the results for you to choose. The implementers use Impeccable’s frontend-design skill for the actual UI work. You pick the version you like, or combine elements from multiple versions, and then it goes through the normal review pipeline.
The session handoff is worth calling out. After each sprint, the coordinator appends detailed notes to claude-progress.txt: What was built, how it was verified, what the next coordinator should pick up, and any warnings or gotchas. This means sessions can restart cleanly. The next time you say “sprint,” a fresh coordinator reads the notes and continues from exactly where the last one left off.
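The log format is free-form; a hypothetical entry (every detail below is made up for illustration) might look like:

```
## Sprint 12
Built: auth-login-form (implementer + two review passes)
Verified: ran the test suite, all passing, output read by coordinator
Next: auth-password-reset (priority 2)
Warning: database migration pending; run it before starting the next feature
```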
QA Specialist
The final gate. This agent validates completed features by executing acceptance criteria as literal user actions in the browser using the Claude Chrome Extension. It navigates to the relevant page, clicks buttons, fills forms, checks the console for errors and captures screenshots.
Features either pass (all criteria verified) or fail (returned to the sprint coordinator’s queue with detailed failure notes). The QA specialist follows the same verification-before-completion rule: It doesn’t trust reports. It performs the actions and observes the results.
Refactor Agent
The fourth agent handles migrations and architectural transformations. It’s designed for multi-session work where changes must be incremental and reversible: Characterization tests before refactoring, one logical change per commit, continuous verification after every step. It’s separate from the sprint coordinator because refactoring has fundamentally different constraints than new feature work.
What I Removed
The cleanup wasn’t just about adding structure. I removed real redundancy.
A standalone frontend-design skill that duplicated what Impeccable already does better. A thin /feature skill that was five lines of generic guidance. The original shift-coder agent that tried to do everything the sprint coordinator now delegates to specialists. A code review plugin whose functionality overlapped with Superpowers’ review skills.
The Uncomfortable Truth About AI Tooling
The setup I described is not simple. Four agents, two architecture skills, two third-party plugins, a feature tracking file, a progress log, and a coordination protocol. That’s a lot of scaffolding for what’s supposed to be an AI assistant.
But here’s what I’ve learned: The model is powerful enough. It has been for a while. The gap is in how you organize the work around it. A brilliant developer who gets a vague brief, no review process and no documentation standards will produce inconsistent work. The same is true for AI.
The investment in this setup was one afternoon. But every time I do this, the next project gets a little more consistent, with better architecture, reviews and documentation. For anyone doing serious work with Claude Code, spending a day on your own setup is the highest-leverage thing you can do.
Just don’t install everything you find along the way. Overhauled your own Claude Code setup? I’d love to compare notes.
