Not “using AI to write code” — using AI as a directed implementation agent operating under strict quality constraints. The methodology matters more than the tooling.
## Methodology

| Practice | Why it matters |
| --- | --- |
| TDD-first agentic workflows | Tests encode requirements in executable form that persists beyond the context window. When the model forgets a constraint, the test still enforces it. |
| User story discipline | Background, foreseeable challenges, explicit acceptance criteria — written before the model sees a prompt. Eliminates the most expensive failure mode: building the wrong thing fast. |
| Owner/builder separation | I own architecture and requirements; the model implements. LLMs are unreliable at decisions requiring implicit context or business judgment, but excellent at turning clear specifications into working code. |
| Iterative review | Every output is reviewed like a PR from a capable junior developer. Unreviewed AI output accumulates subtle issues that individually seem trivial but collectively degrade a codebase. |
| Multi-agent role specialization | Specialized agents (PM, analyst, adversarial TDD, coder, scribe), each scoped to a single concern — the same owner/builder separation, extended to agent teams via CrewAI. |
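As a minimal illustration of the TDD-first pattern: the requirement below (a hypothetical billing-reconciliation tolerance, not the actual Decian spec) is encoded as tests before any implementation prompt is written, so the constraint survives context-window churn. A reference implementation is included only so the example runs.

```python
# Hypothetical acceptance tests, written before the model sees a prompt.
# The requirement ("flag invoices whose billed total drifts from metered
# usage by more than 1%") lives in executable form, so it still holds
# after the original instruction has scrolled out of the context window.

def reconcile(billed: float, metered: float, tolerance: float = 0.01) -> bool:
    """Stand-in implementation the model must satisfy; names are illustrative."""
    if metered == 0:
        return billed == 0
    return abs(billed - metered) / metered <= tolerance

def test_within_tolerance_passes():
    assert reconcile(100.5, 100.0)       # 0.5% drift: acceptable

def test_beyond_tolerance_flags():
    assert not reconcile(103.0, 100.0)   # 3% drift: flagged

def test_zero_usage_requires_zero_billing():
    assert reconcile(0.0, 0.0)
    assert not reconcile(5.0, 0.0)
```

If the model later "optimizes" the tolerance away, the suite goes red immediately; the constraint does not depend on anyone remembering to restate it.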
## Guardrails

| Guardrail | What it prevents |
| --- | --- |
| Least-privilege MCP scoping | The model gets exactly the tools and access it needs for the current task — prevents accidental writes to the wrong systems and limits the blast radius of hallucinated actions. |
| Context anchoring via tests | Constraints stated in early prompts decay as the context window fills. Tests anchor those constraints in executable form that the model can’t silently override. |
| Specification before delegation | If I can’t explain “done” precisely enough for the model to verify, the requirement isn’t ready to delegate. Ambiguity in, plausible garbage out. |
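The least-privilege idea can be sketched generically. This is plain Python, not the MCP wire protocol itself; `ToolScope` and the tool names are invented for illustration:

```python
# Schematic sketch of least-privilege tool scoping: the agent can only
# discover and call the tools granted for the current task. Anything
# else, hallucinated or not, is rejected outright.

class ToolScope:
    def __init__(self, granted: dict):
        self._granted = dict(granted)   # tool name -> callable

    def list_tools(self) -> list:
        return sorted(self._granted)    # the model sees only these

    def call(self, name: str, *args, **kwargs):
        if name not in self._granted:
            raise PermissionError(f"tool {name!r} not granted for this task")
        return self._granted[name](*args, **kwargs)

# Read-only scope for a review task: there is no write tool to misuse.
scope = ToolScope({"read_file": lambda path: f"<contents of {path}>"})
assert scope.list_tools() == ["read_file"]
try:
    scope.call("write_file", "prod.cfg", "...")   # hallucinated action
except PermissionError:
    pass   # blast radius contained
```

The real scoping happens at the MCP server configuration layer, but the invariant is the same: capability not granted, action not possible.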
## Tools

| Tool | Role |
| --- | --- |
| Cursor | Primary development environment with deep agentic integration — inline editing, multi-file context, terminal access. |
| Claude (Anthropic) | Primary model for code generation, reasoning, and review. I chose Claude for its instruction-following fidelity and lower tendency to silently override stated constraints. |
| Model Context Protocol (MCP) | Connects the model to development tools (documentation, APIs, test runners, build systems) with scoped permissions. Transforms the model from generating code in a vacuum to operating within the actual environment. |
| CrewAI | Multi-agent orchestration framework. Coordinates specialized agent teams (PM, analyst, adversarial TDD, coder, scribe) for end-to-end development workflows. |
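A schematic of the role pipeline: the real orchestration uses CrewAI's `Agent`/`Task`/`Crew` classes, while this plain-Python sketch only models the handoffs and the single-concern scoping. Roles and artifact fields are illustrative.

```python
# Each role is scoped to one concern and enriches a shared artifact
# before handing it downstream, mirroring the owner/builder separation.

from dataclasses import dataclass, field

@dataclass
class Artifact:
    story: str = ""
    analysis: str = ""
    tests: str = ""
    code: str = ""
    notes: list = field(default_factory=list)

PIPELINE = [
    ("pm",      lambda a: setattr(a, "story", "user story + acceptance criteria")),
    ("analyst", lambda a: setattr(a, "analysis", "constraints + edge cases")),
    ("tdd",     lambda a: setattr(a, "tests", "failing tests from criteria")),
    ("coder",   lambda a: setattr(a, "code", "implementation passing tests")),
    ("scribe",  lambda a: a.notes.append("decision log -> knowledge base")),
]

def run(pipeline, artifact):
    for role, step in pipeline:
        step(artifact)   # each role touches only its own concern
    return artifact

result = run(PIPELINE, Artifact())
assert result.tests and result.code and result.notes
```

The scoping is the point: the coder never writes requirements, and the TDD role never writes implementation, so no single context window has to hold the whole project.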
## Where It Accelerates

Production systems I shipped using this methodology as a solo engineer at Decian:

- **Rust microservice + Next.js PWA** — Two distinct technology stacks for Mission Control, shipped in parallel
- **Go agents + Kafka stream joins** — Billing reconciliation with cross-system correlation logic specified in tests before implementation
- **Observability dashboards** — Pipeline-aware monitoring answering operational questions, not just displaying metrics
- **This digital garden** — Site, resume generator, and content pipeline

Current exploration:

- **Multi-agent ERPNext implementations** — Using CrewAI to orchestrate specialized agent teams for end-to-end ERP development workflows. Adversarial TDD agents automate the “tests as guardrails” pattern; a scribe agent maintains a Quartz-based knowledge base for persistent cross-agent context. Full details in the methodology facet →
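The adversarial TDD loop can be shown as a toy model. The real agents are LLM-backed; the spec, the probe inputs, and the "patch" mechanism here are deterministic stand-ins:

```python
# Toy adversarial TDD loop: a "red" agent hunts for an input the current
# implementation gets wrong, and a "green" agent patches until no
# counterexample remains. Target behavior: clamp a value into [0, 10].

def find_failing_case(impl):
    """Red agent: probe candidate inputs against the executable spec."""
    for x in (-10, 0, 5, 10, 99):
        expected = max(0, min(10, x))   # the spec, encoded executably
        if impl(x) != expected:
            return x, expected
    return None

def adversarial_loop(impl):
    """Green agent: absorb each counterexample until red gives up."""
    fixes = {}
    while (case := find_failing_case(lambda x: fixes.get(x, impl(x)))):
        x, expected = case
        fixes[x] = expected             # stand-in for a real code patch
    return lambda x: fixes.get(x, impl(x))

naive = lambda x: x                     # initial (wrong) implementation
fixed = adversarial_loop(naive)
assert find_failing_case(fixed) is None # red agent can no longer win
```

The pattern terminates when the red agent runs out of counterexamples, which is exactly the "tests as guardrails" stopping condition the agents automate.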
## Where It Doesn’t Help

| Situation | Why |
| --- | --- |
| Architecture decisions | Depend on implicit organizational context, business constraints, and judgment about what “good enough” means — none of which the model has access to |
| Requirements gathering | The model will fill ambiguity with confident-sounding assumptions rather than asking clarifying questions |
| Multi-step reasoning across large codebases | Tasks requiring deep understanding of how distant parts of a system interact still need human-driven reasoning |
| Novel algorithm design | The model can implement known patterns well but struggles with genuinely novel solutions to unprecedented problems |