AI & Agentic Workflows
How I Built a Multi-Agent Development Workflow That Writes, Plans, and Implements Specs Autonomously
April 2026 / 12 min read
Posted by Tarek Fawaz
The Problem with "Vibe Coding"
There is a growing trend of developers throwing prompts at AI and hoping for the best. The result is usually a mess - inconsistent architecture, no test coverage, no traceability from requirements to implementation, and code that works in a demo but falls apart in production.
I wanted something different. I wanted AI to follow the same disciplined engineering process that a senior team would: gather requirements, ask clarifying questions, design the architecture, plan the implementation in phases, build it incrementally with reviews, and verify against acceptance criteria. Every step documented. Every decision traceable.
So I built an agentic spec-driven development workflow using Claude Code's VS Code extension - a multi-agent system where each agent has a defined role, defined skills, and defined guardrails. The human stays in the loop at every critical decision point.
The Architecture
The system lives inside the project repository itself. No external tools, no SaaS platforms, no complex infrastructure. Just Markdown files and a well-structured .claude/ directory.
project-root/
├── .claude/
│ ├── CLAUDE.md ← Main orchestrator
│ ├── agents/
│ │ ├── architect.md ← System design & boundaries
│ │ ├── implementer.md ← Code generation
│ │ ├── planner.md ← Phase breakdown & estimation
│ │ ├── reviewer.md ← Quality gates
│ │ ├── specwriter.md ← Requirements gathering
│ │ ├── tester.md ← Acceptance verification
│ │ └── securityagent.md ← Threat modeling
│ └── skills/
│ ├── spec-create/SKILL.md
│ ├── spec-plan/SKILL.md
│ ├── spec-implement/SKILL.md
│ ├── spec-review/SKILL.md
│ ├── spec-test/SKILL.md
│ └── spec-status/SKILL.md
├── specs/
│ ├── README.md ← Spec registry & status
│ ├── 001-feature-name.md ← Spec documents
│ ├── plan/ ← Implementation plans
│ └── review/ ← Review findings
└── src/ ← Implementation code
The Agents
Each agent has a narrowly scoped role with its own system prompt, reasoning level, and guardrails. They collaborate through the orchestrator, not directly.
Spec Writer
Requirements engineer. Asks probing questions, challenges assumptions, negotiates scope with the human.
Architect
Thinks in systems. Reviews designs against existing architecture, flags breaking changes.
Planner
Breaks specs into implementation phases with dependencies, effort estimates, and priorities.
Implementer
Writes code following the plan, respecting architectural decisions and project conventions.
Reviewer
Code review with senior engineer rigor. Checks architecture compliance and acceptance criteria.
Tester
Verifies implementation against every acceptance criterion. Generates test strategies.
Security Agent
Participates at every stage. Threat modeling, auth patterns, vulnerability scanning.
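Each agent is just a Markdown file with a system prompt. A minimal sketch of what agents/specwriter.md might look like (the frontmatter fields and prompt wording here are illustrative, not copied from the actual repo):

```markdown
---
name: specwriter
description: Requirements engineer. Invoked during /spec-create to turn a
  problem statement into a numbered spec through dialogue with the human.
tools: Read, Write
---

You are a senior requirements engineer. Your job is to turn a vague problem
statement into a precise, testable spec.

Rules:
- Ask clarifying questions until scope is explicit. Never assume.
- Challenge requirements that conflict with the guardrails in CLAUDE.md.
- Every requirement must have at least one acceptance criterion.
- Stop and hand back to the human before writing the final spec file.
```

The narrow prompt is the guardrail: the Spec Writer cannot drift into writing code because its role definition never mentions it.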
The Skills (Workflow Chains)
Skills are slash commands that chain agents together into a repeatable process. Each skill has a defined input, a defined agent pipeline, and a defined output artifact.
1. /spec-create: Human provides problem statement → Spec Writer + Architect + Security Agent engage in dialogue → Output: numbered spec file registered in specs/README.md
2. /spec-plan: Takes spec ID → Planner + Architect + Security Agent → Human gate if guardrails are broken → Output: phased implementation plan in specs/plan/
3. /spec-implement: Auto or interactive mode → Implementer + Tester + Reviewer invoked per phase based on complexity → Output: working code
4. /spec-review: Reviewer + Security Agent + Architect → Output: review document in specs/review/
5. /spec-test: Tester verifies all acceptance criteria → Output: pass/fail report
6. /spec-status: Reads specs/README.md → Output: current state of all specs
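Each skill is itself a Markdown file that names its input, pipeline, and output artifact. A hypothetical sketch of skills/spec-create/SKILL.md (the section layout and wording are illustrative):

```markdown
---
name: spec-create
description: Turn a problem statement into a numbered, registered spec.
---

## Input
A problem statement from the human.

## Pipeline
1. specwriter - question the human until scope is explicitly confirmed
2. architect - check fit against existing system boundaries
3. securityagent - flag threat-model implications early

## Output
- specs/NNN-feature-name.md (next free spec number)
- specs/README.md updated with status: drafted
```

Because the pipeline is written down rather than improvised per session, every spec goes through the same three reviewers in the same order.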
Human-in-the-Loop Design
This is not an autonomous agent that runs unsupervised. Every critical transition has a human gate:
- Spec creation: The Spec Writer asks questions until the human explicitly confirms scope.
- Planning: If the plan breaks architectural guardrails defined in CLAUDE.md, the human must approve.
- Implementation mode: The human chooses automatic (phase-by-phase) vs. interactive (confirm each phase).
- Review findings: Critical findings require human acknowledgment before proceeding.
- Test failures: The human decides whether to fix, defer, or accept.
The philosophy is that AI handles the mechanical work - research, drafting, code generation, test writing - while humans make the judgment calls about scope, trade-offs, and risk acceptance.
Model Selection by Reasoning Level
Not every agent needs the same firepower. Matching reasoning level to role keeps the system responsive without sacrificing quality where it counts.
| Agent | Reasoning Level | Why |
|---|---|---|
| Spec Writer | High | Handles ambiguity, asks good questions |
| Architect | Highest | Architectural decisions hardest to reverse |
| Planner | Medium-High | Dependencies and sequencing |
| Implementer | Medium → High | Standard patterns vs complex algorithms |
| Reviewer | High | Spots subtle issues |
| Tester | Medium | Test patterns are structured |
| Security Agent | High | Requires adversarial thinking |
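In practice this mapping lives in each agent's frontmatter. Claude Code subagent files accept a model field; a sketch for the Architect (the specific model alias here is an assumption, not taken from the repo):

```markdown
---
name: architect
description: System design and boundary review. Use for any change that
  touches module interfaces or introduces a new dependency.
model: opus   # highest reasoning tier; architectural calls are hardest to reverse
---
```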
What This Looks Like in Practice
Creating a spec:
> /spec-create
You: "We need a graph builder component that lets users
visually construct workflows using drag-and-drop nodes
and edges, with real-time C# code generation."
Spec Writer: "Let me understand the scope. A few questions:
1. Should the graph support cycles, or is it a DAG only?
2. What's the target: browser-based React, or desktop?
3. Should code generation happen on every change or on-demand?
4. What node types are needed for v1?"
[dialogue continues until scope is confirmed]
→ Generated: specs/001-graph-builder.md
→ Updated: specs/README.md (status: drafted)
Planning the implementation:
> /spec-plan 001
Phase 1: Core graph data model (2 days)
Dependencies: None
Phase 2: React Flow canvas integration (3 days)
Dependencies: Phase 1
Phase 3: Code generation engine (4 days)
Dependencies: Phase 1, Phase 2
⚠️ GUARDRAIL: Phase 3 introduces Scriban as new dependency.
Human approval required.
You: "Approved - Scriban is the right choice."
→ Generated: specs/plan/001-plan.md
→ Updated: specs/README.md (status: planned)
Executing the implementation:
> /spec-implement 001 --mode interactive
Implementer: "Starting Phase 1: Core graph data model.
Creating Node.cs, Edge.cs, Port.cs, GraphState.cs..."
Tester: "12 unit tests generated. All passing."
"Phase 1 complete. Proceed to Phase 2?"
You: "Yes, proceed."
[continues through all phases with review + test per phase]
Why This Matters
This workflow produces something most AI-generated code does not have: traceability. Every line of code traces back to a spec, which traces back to a problem statement, which was validated through human dialogue. Every architectural decision is documented. Every phase has a plan. Every implementation has a review.
When you come back to this code six months later, you do not have to wonder "why was it built this way?" The specs, plans, and reviews tell the full story.
For teams, this becomes even more powerful. A new developer can read the spec, understand the plan, review the implementation, and see the test results - all without asking a single question.
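The artifact that makes this onboarding possible is the registry, which is just a table in specs/README.md. A sketch of what it might contain (the entries are illustrative):

```markdown
| Spec | Title         | Status      | Plan             | Review               |
|------|---------------|-------------|------------------|----------------------|
| 001  | Graph builder | implemented | plan/001-plan.md | review/001-review.md |
| 002  | Export to PNG | planned     | plan/002-plan.md | (pending)            |
```

/spec-status reads this one file, so the state of every feature is answerable without scanning the codebase.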
Building Your Own
The entire system is portable. Adapt it to any project by:
- Writing your CLAUDE.md with project-specific conventions and guardrails
- Defining agents that match your team's roles and expertise needs
- Creating skills that enforce your development process
- Setting up the specs/ directory structure
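A minimal CLAUDE.md to start from might look like this (the conventions and guardrails are examples, not the ones from this project):

```markdown
# Project conventions
- C# 12, nullable reference types enabled, xUnit for tests.
- Specs live in specs/, one numbered Markdown file per feature.

# Guardrails (breaking any of these requires human approval)
- No new third-party dependencies.
- No changes to the public API surface of src/Core.
- Every phase ships with passing tests before the next phase starts.

# Workflow
- All feature work starts with /spec-create. No direct-to-code changes.
```

The guardrails section is what powers the human gates: the Planner checks its output against these rules and escalates when a phase would break one.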
The process is the product. The agents and skills encode your engineering culture into a repeatable, auditable workflow. The AI handles the execution. You keep the judgment.
At TM-Tech Alliance, we build agentic AI workflows and developer tooling. If you are interested in implementing structured AI-assisted development for your team, get in touch.