The Cascade
What Is It
The Cascade is a development methodology for building production software with AI coding agents. It applies waterfall's sequential discipline at feature scale — each feature cascades through a fixed sequence of phases — while running multiple cascades in parallel across sprint cycles.
Traditional waterfall fails because it applies rigid sequencing to an entire project. Agile succeeds by breaking work into small increments but often loses architectural rigor. The Cascade takes the best of both: every feature gets the full waterfall treatment, but each waterfall is small enough to complete in days, not months.
Sprint N
├── Feature A: ████████░░░░░░░░░░░░ (Implement → Review)
├── Feature B: ████████████████░░░░ (Review → Compound)
├── Feature C: ██░░░░░░░░░░░░░░░░░░ (Research → Specify)
└── Bug Fix D: ████████████████████ (Implement → Done)
Core thesis: the specification is the product, and the code is a derivative.
The spec is the source of truth — not the code. If the code doesn't match the spec, the code is wrong. This inverts how most developers think about software, where code is the authoritative artifact and docs are an afterthought that drifts out of date.
With AI coding agents, this inversion becomes practical for three reasons:
- AI output quality is proportional to input quality. A vague prompt produces vague code. A precise spec produces precise code. The spec is the investment.
- Code is cheap to regenerate; specs are not. An agent can rewrite an implementation in minutes. The hard work is the thinking behind what to build and why: the problem framing, the boundary decisions, the edge cases.
- Specs are human-reviewable; code volume is not. A solo developer using AI agents can generate more code in a day than they can meaningfully review, but they can review a spec. If the spec is right, code review becomes verification against the spec rather than reasoning about intent from scratch.
The Cascade is built around this: every phase before Implementation is about getting the spec right. Implementation is mechanical — the TDD loop just translates the spec into tested code.
The Six Phases
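Each feature cascades through six phases in strict order. For medium and larger features, no phase from Specify through Compound is skipped; Research (Phase 0) may be skipped when the domain is well understood. Each phase produces specific artifacts, and any gate that follows it must pass before the next phase begins.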
┌──────────────┐
│ 0: RESEARCH  │  Understand the landscape
│              │  Artifact: Research Brief
└──────┬───────┘
       │
┌──────▼───────┐
│ 1: SPECIFY   │  Define WHAT and WHY
│              │  Artifact: Product Spec
│              │  Optional: Prototype → Prototype Spec
└──────┬───────┘
       │
  ◆ Human Review Gate ◆
       │
┌──────▼───────┐
│ 2: DESIGN    │  Define HOW
│              │  Artifacts: Technical Design + ADRs + Task Breakdown
└──────┬───────┘
       │
  ◆ Human Review Gate ◆
       │
  ◆ /grill-me Exit Gate ◆
       │
┌──────▼───────┐
│ 3: IMPLEMENT │  TDD loop per task (fresh session each)
│              │  Artifacts: Tests + Code + PLANS.md
└──────┬───────┘
       │
  ◆ Automated Verification ◆
       │
┌──────▼───────┐
│ 4: REVIEW    │  Fresh-context code review
│              │  Artifact: Review Notes
└──────┬───────┘
       │
┌──────▼───────┐
│ 5: COMPOUND  │  Capture learnings, update rules, ship
│              │  Artifacts: Updated rules + memory + commit
└──────────────┘
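The phase order and gates in the diagram can be sketched as a small state machine. This is an illustration only: the gate labels (`human-review`, `grill-me`, `automated-verification`) and the function name are hypothetical names chosen here, not part of any tool.

```python
from enum import IntEnum


class Phase(IntEnum):
    RESEARCH = 0
    SPECIFY = 1
    DESIGN = 2
    IMPLEMENT = 3
    REVIEW = 4
    COMPOUND = 5


# Gates that must pass before leaving a phase, read off the diagram.
# The string labels are hypothetical identifiers for this sketch.
EXIT_GATES = {
    Phase.SPECIFY: ["human-review"],
    Phase.DESIGN: ["human-review", "grill-me"],
    Phase.IMPLEMENT: ["automated-verification"],
}


def next_phase(current: Phase, passed_gates: set[str]) -> Phase:
    """Return the phase after `current`, refusing to advance past an unmet gate."""
    missing = [g for g in EXIT_GATES.get(current, []) if g not in passed_gates]
    if missing:
        raise RuntimeError(f"{current.name} blocked by gates: {', '.join(missing)}")
    if current is Phase.COMPOUND:
        raise RuntimeError("COMPOUND is the final phase")
    return Phase(current + 1)
```

For example, `next_phase(Phase.DESIGN, {"human-review"})` raises because the grill-me gate has not passed, while adding `"grill-me"` to the set returns `Phase.IMPLEMENT`.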
Phase 0: Research
Goal: Understand the landscape before designing anything.
Process:
- Define the research question
- Launch parallel research subagents (competitive, technical, prior art)
- Search GitHub for existing implementations, skeleton projects, libraries
- Search package registries before writing utility code
- Synthesize findings into a Research Brief
Artifact: Research Brief
---
type: research-brief
date: YYYY-MM-DD
project: <project name>
status: draft
---
# Research Brief: <Feature/Product Name>
## Research Question
What are we trying to learn?
## Competitive Landscape
| Product | Approach | Strengths | Gaps |
|---------|----------|-----------|------|
## Prior Art
- Existing implementations found
- Libraries/packages that solve part of the problem
- Open source projects worth forking/adapting
## Technical Patterns
- Common architectural approaches
- Proven patterns from similar systems
## Key Insights
- What should influence our design?
- What should we avoid?
## Recommendation
Build vs. buy vs. adapt decision with rationale
When to skip: Small features, bug fixes, well-understood domains.
Agent Pipeline:
Researcher (parallel subagents) → Librarian (file + tag + link)
- Multiple Researcher agents run in parallel investigating different aspects (competitive, technical, prior art)
- Librarian files the Research Brief, updates activity logs, and creates wiki-links to related vault content
Phase 1: Specify
Goal: Define WHAT to build and WHY. No implementation details — just behavior, requirements, and success criteria.
Process:
- Start in Plan Mode or a dedicated spec session
- Use the interview pattern for medium+ features:
  I want to build [brief description]. Interview me in detail. Ask about users, use cases, edge cases, concerns, and tradeoffs. Don't ask obvious questions; dig into the hard parts I might not have considered. Keep interviewing until we've covered everything. Then write a complete Product Spec.
- For smaller features, write the spec directly or have Claude draft from your notes
- Review the spec carefully — this is the most important document
- (Optional) Build a throwaway Prototype to visualize the product or feature
  - The prototype is a visual exploration tool: it demonstrates what the product could be and the functionality it could have
  - No prototype code advances into Phase 2. The prototype is discarded before Design begins
  - If the prototype reveals valuable insights, produce a Prototype Spec documenting what worked, what didn't, and what carries forward as requirements
  - The Prototype Spec becomes supporting material for the Design phase alongside the Product Spec
Artifact: Product Spec
---
type: product-spec
date: YYYY-MM-DD
project: <project name>
status: draft | in-review | approved
tags: []
---
# Product Spec: <Feature Name>
## Problem Statement
What problem are we solving and for whom?
## User Stories
- As a [user], I want to [action], so that [value]
- Include acceptance criteria for each story
## Functional Requirements
### Phase 1 (MVP)
- Requirement with testable acceptance criteria
### Phase 2 (Enhancement)
- Requirement with testable acceptance criteria
## Non-Functional Requirements
- Performance: <specific, measurable targets>
- Security: <specific requirements>
- Scale: <expected load, data volumes>
## Boundaries
### DO NOT CHANGE
- Stable components that must not be modified
- Existing API contracts to preserve
- Database schemas to respect
### Must Ask Before Changing
- Architectural decisions that need human approval
- External service integrations
## Edge Cases
- What happens when X?
- How should the system handle Y?
## Success Metrics
- How do we know this works? (testable)
## Open Questions
- Unresolved decisions that need input
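Because Design starts only from an approved Product Spec, the frontmatter and section headings of the template can be checked mechanically before the Human Review Gate is considered passed. The sketch below is one possible check, not an existing tool: it assumes flat `key: value` frontmatter exactly as in the template, and the function names are hypothetical.

```python
REQUIRED_SECTIONS = [
    "Problem Statement", "User Stories", "Functional Requirements",
    "Non-Functional Requirements", "Boundaries", "Edge Cases",
    "Success Metrics", "Open Questions",
]


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading '---' markers."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


def spec_problems(text: str) -> list[str]:
    """List reasons this Product Spec is not ready for the Design phase."""
    meta = parse_frontmatter(text)
    problems = []
    if meta.get("type") != "product-spec":
        problems.append("type is not product-spec")
    if meta.get("status") != "approved":
        problems.append(f"status is {meta.get('status')!r}, not 'approved'")
    # Only level-2 headings ("## ...") count as template sections.
    headings = {ln[3:].strip() for ln in text.splitlines() if ln.startswith("## ")}
    problems += [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in headings]
    return problems
```

An empty list means the structural checks pass; it says nothing about whether the content is right, which remains the human reviewer's job.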
Artifact: Prototype Spec (optional — only when a prototype was built)
---
type: prototype-spec
date: YYYY-MM-DD
project: <project name>
prototype-status: discarded
---
# Prototype Spec: <Feature Name>
## Prototype Overview
What was built and what it demonstrated
## What Worked
- Functionality or UX patterns that validated well
- Interactions that felt right
- These carry forward as requirements into Design
## What Didn't Work
- Approaches that were confusing, slow, or wrong
- Anti-patterns discovered during exploration
- These are explicitly excluded from Design
## Insights
- Surprises or discoveries not anticipated in the Product Spec
- Revised assumptions based on seeing the feature in action
## Recommendations for Design
- Specific guidance for the Technical Design phase based on prototype learnings
Key principles for AI-optimized specs:
- Be explicit — AI cannot infer unstated requirements
- Include "DO NOT CHANGE" sections — AI will cheerfully rewrite stable code unless told not to
- Use phased delivery — prevents the agent from attempting too much at once
- Include examples — concrete input/output scenarios dramatically improve implementation quality
- Define boundaries — what NOT to do is often more valuable than what to do
Agent Pipeline:
Product Manager (spec writing, interview) → Tech Writer (polish) → Librarian (file + tag + link)
- Product Manager drives the interview pattern, writes the Product Spec with requirements, boundaries, and acceptance criteria
- Tech Writer polishes the spec into a clean, professional document
- Librarian files the Product Spec (and Prototype Spec if applicable), updates activity logs
- Prototype building (when used) is direct human + Claude work — no agent pipeline, vibe-code style
Session management: Spec creation can consume significant context. After the spec is written, /clear and start fresh for the Design phase.
Phase 2: Design
Goal: Define HOW to build it. Architecture, tech stack, data models, API contracts, and a sequenced task breakdown.
Process:
- Start a fresh session with the approved Product Spec (and Prototype Spec, if one exists)
- Use Plan Mode for architecture exploration:
  Read the Product Spec at <path>. Create a Technical Design document covering architecture, data models, API contracts, and key decisions. Then create ADRs for each significant architectural choice. Finally, break the implementation into sequenced tasks with dependencies.
- Review all three artifacts before moving to implementation
Artifact: Technical Design
---
type: technical-design
date: YYYY-MM-DD
project: <project name>
spec: "[[Product Spec Name]]"
status: draft | approved
---
# Technical Design: <Feature Name>
## Architecture Overview
High-level architecture with component diagram (Mermaid or ASCII)
## Technology Stack
- Language/framework with specific versions
- Dependencies with version constraints
- Infrastructure requirements
## Data Models
Schema definitions with types and constraints
## API Contracts
Endpoint definitions with request/response schemas
## Component Design
For each major component:
- Interface (inputs, outputs)
- Internal behavior
- Dependencies
- Error handling strategy
## Integration Points
How this connects to existing systems
## Security Considerations
Authentication, authorization, data protection
## Performance Targets
Specific, measurable benchmarks
## Testing Strategy
- Unit test approach
- Integration test approach
- What to mock, what to test against real services
Artifact: Architecture Decision Records
---
type: adr
date: YYYY-MM-DD
status: proposed | accepted | deprecated
---
# ADR-NNN: <Decision Title>
## Context
Forces at play, problem description
## Decision
The change we're making
## Consequences
What becomes easier, what becomes harder, trade-offs accepted
Artifact: Task Breakdown
---
type: task-breakdown
date: YYYY-MM-DD
spec: "[[Product Spec Name]]"
design: "[[Technical Design Name]]"
---
# Task Breakdown: <Feature Name>
## Task 1: <Title>
- **Description:** What to implement
- **Dependencies:** None | Task N
- **Files:** Files to create or modify
- **Acceptance Criteria:** Testable conditions
- **Verification:** Test command to run
- **Status:** pending | in-progress | done
- **Notes:** (updated during implementation)
## Task 2: <Title>
...
Key principles:
- Tasks should be small enough for a single session (clean context)
- Each task has explicit verification criteria
- Dependencies are ordered — no task starts until its dependencies are verified
- The "Notes" field is updated during implementation (persistent storage pattern)
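The dependency rule above (no task starts until its dependencies are verified) amounts to a topological ordering over the Task Breakdown. A minimal sketch using the standard-library `graphlib` module follows; the task graph and function names are hypothetical examples.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: task -> set of tasks it depends on.
tasks = {
    "Task 1": set(),
    "Task 2": {"Task 1"},
    "Task 3": {"Task 1"},
    "Task 4": {"Task 2", "Task 3"},
}


def ready_tasks(deps: dict[str, set[str]], verified: set[str]) -> list[str]:
    """Tasks not yet verified whose dependencies have all been verified."""
    return sorted(t for t, d in deps.items() if t not in verified and d <= verified)


def execution_order(deps: dict[str, set[str]]) -> list[str]:
    """One valid sequential order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(deps).static_order())
```

`ready_tasks` answers "what can start now", which is also the set of tasks that could run in parallel sessions; `execution_order` gives a single safe sequence for a solo developer.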
Agent Pipeline:
Planner (architecture + task sequencing) → Product Manager (ADRs) → Tech Writer (design doc) → Librarian (file + tag + link) → grill-me (exit gate)
- Planner drives architecture exploration, component decomposition, and task sequencing
- Product Manager writes ADRs for each significant architectural choice
- Tech Writer produces the polished Technical Design document
- Librarian files all artifacts, updates activity logs, and cross-links to the Product Spec
- Exit gate: Run /grill-me against the Technical Design and Task Breakdown. The skill interviews relentlessly about every architectural decision, dependency, and trade-off. Design is not complete until grill-me reaches a shared understanding with no open questions. See Appendix A: AI Skills for the full skill definition
Phase 3: Implement
Goal: Execute tasks one at a time using TDD. Human writes tests (or test specs), AI implements.
Process (per task):
- Start a fresh session — clean context focused on this one task
- Load only the relevant context:
  Read the Task Breakdown at <path>. We're implementing Task N. Read the Product Spec at <path> for requirements context. Read the Technical Design at <path> for architecture decisions.
- Write tests first (or have Claude write tests from acceptance criteria, then review):
  Write tests for Task N based on the acceptance criteria. Do NOT implement yet, just the tests. They should all fail.
- Implement to pass tests:
  Now implement Task N. Make all tests pass. Run the full test suite and fix any regressions.
- Update the task status and notes in the Task Breakdown
- /clear before the next task
For complex tasks — use PLANS.md:
# PLANS.md
## Goal
What this implementation session achieves
## Steps
- [ ] Step 1
- [ ] Step 2
- [ ] Step 3
## Progress
(Updated as work proceeds)
## Surprises & Discoveries
(Unexpected findings during implementation)
## Decision Log
(Choices made during execution and rationale)
TDD rules:
- Never let AI modify tests to make them pass
- Tests must run in seconds — long test suites kill the feedback loop
- One test at a time for critical logic
- Review AI-written tests before trusting them as specification
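One way to audit the first rule after the fact is to fingerprint the test files once they are reviewed and compare again after implementation. The sketch below assumes tests live in files named `test_*.py`; that naming pattern and the function names are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def snapshot(test_dir: Path) -> dict[str, str]:
    """Map each test file (path relative to test_dir) to the SHA-256 of its contents."""
    return {
        str(p.relative_to(test_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(test_dir.rglob("test_*.py"))
    }


def changed_tests(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Test files that were edited or deleted since the snapshot.

    Newly added test files are not reported: adding tests is allowed,
    changing reviewed ones is not.
    """
    return sorted(name for name, digest in before.items() if after.get(name) != digest)
```

Take `snapshot(...)` after the tests are reviewed and again after the implementation step; a non-empty `changed_tests(...)` result means a reviewed test was touched and the change needs a human look.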
Session management:
- One fresh session per task or small task group
- Never let a single session span unrelated tasks
- Use subagents for investigation so exploration doesn't fill main context
- If you've corrected the agent twice on the same issue, /clear and restart with a better prompt
Agent Pipeline:
tdd-guide (per task TDD loop)
- tdd-guide enforces the write-tests-first discipline for each task
- No other agents in this phase — implementation is a tight loop between the human, the agent, and the test suite
Phase 4: Review
Goal: Code review in a fresh context, free from implementation bias.
Process:
- Start a fresh session (separate from the implementation session)
- Point the reviewer at the changes:
  Review the changes for <feature>. Check for:
  - Correctness against the Product Spec at <path>
  - Edge cases and error handling
  - Security vulnerabilities
  - Performance concerns
  - Code quality and maintainability
  - Consistency with existing patterns
- Address findings, re-run tests
Why fresh context matters: An AI agent in a fresh session is more likely to catch issues it overlooked during implementation, because it is not anchored to its own reasoning from the writing session.
For larger features: Use parallel sessions with git worktrees:
- Session A: Implements
- Session B: Reviews Session A's output
- Session A: Addresses review feedback
Agent Pipeline:
All reviewers run in parallel (fresh context):
├── code-reviewer (correctness, quality, patterns)
├── security-reviewer (vulnerabilities, auth, data protection)
├── python-reviewer (Python idioms, typing, packaging)
├── go-reviewer (Go idioms, error handling, concurrency)
├── rust-reviewer (ownership, lifetimes, unsafe usage)
└── cpp-reviewer (memory safety, RAII, modern C++ practices)
- All reviewers launch in parallel in a fresh session — none share context with the implementation session
- Language-specific reviewers activate based on which languages are present in the changeset
- Each reviewer produces findings independently; findings are merged and deduplicated before addressing
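The activation and merge rules above can be sketched as two small functions. The extension-to-reviewer mapping and the `(file, line, message)` shape of a finding are assumptions made for this example; real findings would likely need fuzzier matching than exact equality.

```python
from pathlib import PurePosixPath

# Reviewers that always run, plus language reviewers keyed by file extension.
# The extension mapping is an assumption for illustration.
ALWAYS = ["code-reviewer", "security-reviewer"]
BY_EXTENSION = {
    ".py": "python-reviewer",
    ".go": "go-reviewer",
    ".rs": "rust-reviewer",
    ".cpp": "cpp-reviewer",
    ".cc": "cpp-reviewer",
    ".hpp": "cpp-reviewer",
}

Finding = tuple[str, int, str]  # (file, line, message), a hypothetical shape


def select_reviewers(changed_files: list[str]) -> list[str]:
    """Always-on reviewers plus one per language present in the changeset."""
    suffixes = {PurePosixPath(f).suffix for f in changed_files}
    language_reviewers = {BY_EXTENSION[s] for s in suffixes if s in BY_EXTENSION}
    return ALWAYS + sorted(language_reviewers)


def merge_findings(per_reviewer: dict[str, list[Finding]]) -> dict[Finding, list[str]]:
    """Merge findings, dropping exact duplicates but keeping who reported each one."""
    merged: dict[Finding, list[str]] = {}
    for reviewer, findings in per_reviewer.items():
        for finding in findings:
            merged.setdefault(finding, []).append(reviewer)
    return merged
```

Keeping the list of reporting reviewers per finding is a design choice: an issue flagged independently by two reviewers is a reasonable one to address first.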
Phase 5: Compound
Goal: Capture learnings, update project context, commit and ship.
Process:
- Update rules files with any new patterns discovered
- Update CLAUDE.md if new conventions emerged
- Save learnings to memory for future sessions
- Commit with descriptive message
- Update activity logs
What to capture:
- New coding patterns that worked well → add to rules
- Gotchas or non-obvious behaviors → add to CLAUDE.md
- Workflow preferences confirmed or corrected → save to memory
- Libraries or approaches that proved valuable → note for future reference
This is the compound learning loop. Each cascade makes the next one better because the project's context files grow more precise and the agent needs less correction.
Agent Pipeline:
Librarian (file + tag + link + activity logs)
- Librarian files any new artifacts, updates activity logs, and ensures cross-links are current
- Rule and memory updates are done directly by the human + Claude — no agent delegation for learning capture
Scaling: When to Use What
Feature Size → Process Depth
| Size | Example | Phases | Duration |
|---|---|---|---|
| Trivial | Fix typo, add log line | Phase 3 only | Minutes |
| Small | Add endpoint, UI tweak | 1 → 3 → 5 | < 1 hour |
| Medium | New feature, API redesign | 1 → 2 → 3 → 4 → 5 | Hours |
| Large | New module, system integration | 0 → 1 → 2 → 3 → 4 → 5 | Days |
| Greenfield | New project from scratch | All phases, multiple cycles | Weeks |
Decision Framework
Is it a one-line fix?
├─ Yes → Just do it (Phase 3 only)
└─ No → Can you describe the diff in one sentence?
   ├─ Yes → Brief spec + implement (Phase 1 light + Phase 3)
   └─ No → Does it touch multiple files or systems?
      ├─ No → Spec + implement + review (Phase 1 + 3 + 4)
      └─ Yes → Full cascade (Phase 0-5)
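The decision tree reads naturally as three questions asked in order. The sketch below mirrors the tree exactly as written (it does not try to reconcile it with the size table); the function and parameter names are hypothetical.

```python
def phases_for(one_line_fix: bool, one_sentence_diff: bool,
               multi_file_or_system: bool) -> list[int]:
    """Walk the decision tree top-down and return the phase numbers to run."""
    if one_line_fix:
        return [3]                 # just do it
    if one_sentence_diff:
        return [1, 3]              # Phase 1 is a brief ("light") spec here
    if not multi_file_or_system:
        return [1, 3, 4]           # spec + implement + review
    return [0, 1, 2, 3, 4, 5]      # full cascade
```

Later questions are only consulted when the earlier ones answer "no", so `phases_for(True, False, True)` still returns `[3]`.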
Session Architecture
Each phase gets its own session. Context is cleared between phases. The artifact from one phase becomes the input to the next — this is the cascade.
Session 1: Research (Phase 0)
└─ Subagents do parallel research
└─ Output: Research Brief
└─ /clear
Session 2: Specification (Phase 1)
└─ Interview pattern → Product Spec
└─ Optional: Prototype → Prototype Spec
└─ /clear
Session 3: Design (Phase 2)
└─ Load spec → Technical Design + ADRs + Task Breakdown
└─ /grill-me exit gate
└─ /clear
Session 4..N: Implementation (Phase 3, one per task)
└─ Load task + spec + design → TDD loop
└─ /clear after each task
Session N+1: Review (Phase 4)
└─ Fresh context → review all changes
└─ /clear
Session N+2: Compound (Phase 5)
└─ Capture learnings → commit → ship
Project Folder Structure
Cascade artifacts live in a versioned sdd/ folder at the project root. Each version is a complete, self-contained snapshot of the spec-driven design at that point in time.
<project-root>/
└── sdd/
├── v1/
│ ├── research/ ← Research Briefs
│ ├── specs/ ← Product Specs, Prototype Specs, Task Breakdowns
│ └── docs/ ← Technical Designs, ADRs
│ └── adr/
├── v2/
│ ├── research/
│ ├── specs/
│ └── docs/
│ └── adr/
└── ...
Versioning rule: copy forward. When a new version is cut, every artifact from the prior version is copied into the new version folder as the baseline. Changes happen in the new version; prior versions are frozen.
This preserves a clean, navigable history of how the design evolved. Reading any vX/ folder gives you the complete picture of the system as it was specified at that version — no need to reconstruct state by tracing diffs across folders.
When to bump versions:
- Major product release or milestone
- Significant scope change that invalidates prior design assumptions
- A new "phase" of the product where prior specs no longer reflect intent
Trivial edits and corrections happen in place within the current version.
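The copy-forward rule can be sketched with the standard library. This assumes version folders are named `v1`, `v2`, and so on directly under `sdd/`, as in the tree above; the function name is hypothetical, and nothing here enforces that prior versions stay frozen.

```python
import shutil
from pathlib import Path


def cut_version(sdd_root: Path) -> Path:
    """Copy the latest sdd/vN folder to sdd/v(N+1) and return the new path."""
    versions = sorted(
        (int(p.name[1:]), p)
        for p in sdd_root.iterdir()
        if p.is_dir() and p.name.startswith("v") and p.name[1:].isdigit()
    )
    if not versions:
        raise FileNotFoundError(f"no vN folders under {sdd_root}")
    latest_number, latest_path = versions[-1]
    new_path = sdd_root / f"v{latest_number + 1}"
    shutil.copytree(latest_path, new_path)  # raises if the target already exists
    return new_path
```

Sorting on the numeric part keeps `v10` after `v9`, which a plain string sort would get wrong.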
Artifact Map
All Artifacts by Phase
| Phase | Artifact | Format | Location | Consumer |
|---|---|---|---|---|
| 0 Research | Research Brief | Markdown | sdd/vX/research/ | Human + AI |
| 1 Specify | Product Spec | Markdown | sdd/vX/specs/ | Human + AI |
| 1 Specify | Prototype Spec (optional) | Markdown | sdd/vX/specs/ | Human + AI |
| 2 Design | Technical Design | Markdown | sdd/vX/docs/ | Human + AI |
| 2 Design | ADRs | Markdown | sdd/vX/docs/adr/ | Human + AI |
| 2 Design | Task Breakdown | Markdown | sdd/vX/specs/ | AI (primary) |
| 3 Implement | Tests | Code | Project test directory | AI + CI |
| 3 Implement | Code | Code | Project source directory | AI + Human |
| 3 Implement | PLANS.md | Markdown | Project root (temporary) | AI |
| 4 Review | Review Notes | In-session | Ephemeral (or saved) | Human |
| 5 Compound | Updated Rules | Markdown | .claude/rules/ | AI |
| 5 Compound | Updated CLAUDE.md | Markdown | Project root | AI |
| 5 Compound | Memory entries | Markdown | .claude/projects/*/memory/ | AI |
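For the artifacts that live under sdd/, the Location column can be turned into a lookup keyed by the frontmatter `type` values used in the templates. This is a sketch under that assumption; the function name and the example filenames are hypothetical.

```python
from pathlib import PurePosixPath

# Folder per artifact type, taken from the Location column of the table.
ARTIFACT_DIRS = {
    "research-brief": "research",
    "product-spec": "specs",
    "prototype-spec": "specs",
    "task-breakdown": "specs",
    "technical-design": "docs",
    "adr": "docs/adr",
}


def artifact_path(version: int, artifact_type: str, filename: str) -> PurePosixPath:
    """Build the sdd/vX/... path for an artifact, keyed by its frontmatter `type`."""
    try:
        folder = ARTIFACT_DIRS[artifact_type]
    except KeyError:
        raise ValueError(f"unknown artifact type: {artifact_type}") from None
    return PurePosixPath("sdd") / f"v{version}" / folder / filename
```

For instance, `artifact_path(2, "adr", "ADR-001.md")` yields `sdd/v2/docs/adr/ADR-001.md`.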
Artifact Flow
Research Brief ─────► Product Spec ──────┬──► Technical Design
                      Prototype Spec ────┘      │
                        │                       ├──► ADRs
                        │                       │
                        └───────────────────────┴──► Task Breakdown
                                                           │
                                                    /grill-me gate
                                                           │
                                                     ┌─────┼─────┐
                                                     │     │     │
                                                  Task 1 Task 2 Task N
                                                     │     │     │
                                                   Tests Tests Tests
                                                     │     │     │
                                                   Code  Code  Code
                                                     │     │     │
                                                     └─────┼─────┘
                                                           │
                                                        Review
                                                           │
                                                     ┌─────┼─────┐
                                                     │     │     │
                                                   Rules Memory CLAUDE.md
                                               (compound learning loop)
Hook Integration
| Hook | Phase | Purpose |
|---|---|---|
| PreToolUse:Edit on test files | 3 Implement | Block AI from modifying tests to pass them |
| PostToolUse:Edit | 3 Implement | Auto-lint after every file edit |
| PostToolUse:Write | 3 Implement | Auto-format new files |
| Stop | 5 Compound | Extract learnings before session ends |
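A minimal sketch of the first hook in the table, written as the body of a hook script. It assumes the hook command receives the tool call as JSON on stdin with the edited path under `tool_input.file_path`, and that exit status 2 blocks the call and returns stderr to the agent; both assumptions should be checked against the hooks documentation for the Claude Code version in use. The `is_test_file` heuristic and all function names are hypothetical.

```python
import json
import sys
from pathlib import PurePosixPath


def is_test_file(path: str) -> bool:
    """Heuristic (an assumption): anything under tests/, or named test_* / *_test.*."""
    p = PurePosixPath(path)
    return "tests" in p.parts or p.name.startswith("test_") or p.stem.endswith("_test")


def decide(payload_text: str) -> tuple[int, str]:
    """Return (exit_code, message) for one hook invocation."""
    payload = json.loads(payload_text)
    path = payload.get("tool_input", {}).get("file_path", "")
    if is_test_file(path):
        return 2, f"Blocked: {path} is a test file. Fix the implementation, not the test."
    return 0, ""


def main() -> None:
    """Entry point for the hook script: read stdin, report, exit with the decision."""
    code, message = decide(sys.stdin.read())
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)

# In the actual hook script, the last line would be a call to main().
```

A hook like this would also block the agent while it is writing the tests in the first place, so it is best enabled only for the implement step of each task.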
Parallel Execution Patterns
| Pattern | Phase | How |
|---|---|---|
| Parallel research | 0 Research | Multiple subagents investigate different aspects simultaneously |
| Fan-out migration | 3 Implement (batch) | claude -p "Migrate $file" in parallel across files |
| Writer/Reviewer | 3-4 | Implementation + review in separate worktrees |
| Multi-perspective review | 4 Review | Security + code + language-specific reviewers in parallel |
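The fan-out row can be sketched with a thread pool that starts one headless `claude -p` process per file. Only the `claude -p "Migrate $file"` invocation comes from the table; the worker count, function names, and the injectable `runner` parameter (useful for dry runs) are assumptions for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(path: str) -> list[str]:
    """One headless invocation per file, mirroring `claude -p "Migrate $file"`."""
    return ["claude", "-p", f"Migrate {path}"]


def run_one(path: str) -> tuple[str, int]:
    """Run the migration for one file and return (path, exit code)."""
    result = subprocess.run(build_command(path), capture_output=True, text=True)
    return path, result.returncode


def fan_out(paths: list[str], runner=run_one, max_workers: int = 4) -> dict[str, int]:
    """Run `runner` over all paths concurrently and collect the exit code per path."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(runner, paths))
```

Threads are sufficient here because each worker only waits on a subprocess. Passing a stub such as `runner=lambda p: (p, 0)` exercises the fan-out without invoking the CLI.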
Principles
Seven principles underpin The Cascade. They recur across production-grade AI development methodologies:
| # | Principle | Why It Matters |
|---|---|---|
| 1 | Specify before implementing | AI output quality is directly proportional to input quality. |
| 2 | Structure context deliberately | Context window is the #1 constraint. Performance degrades as it fills. |
| 3 | Work in small increments | One function, one feature, one test at a time. Prevents compounding errors. |
| 4 | Maintain human review gates | AI generates, humans validate. |
| 5 | Automate verification | Tests, linters, CI/CD as safety nets. The single highest-leverage practice. |
| 6 | Document for the AI, not just humans | Explicit, example-driven, boundary-aware specs. AI cannot infer unstated requirements. |
| 7 | Feed learnings back | Compound knowledge over time. Rules, memory, and learnings create a virtuous cycle. |
Appendix A: AI Skills
Skills are reusable prompts invoked via slash commands during a cascade. They are distinct from agents — an agent is a persona with capabilities and routing rules, while a skill is a structured prompt that can be run by any agent or directly by the human.
/grill-me — Design Interview
Used as: Exit gate for Phase 2 (Design)
Purpose: Relentlessly interview the user about every aspect of the design until all decisions are resolved. Catches gaps, vague thinking, and unstated assumptions before the feature moves to Implementation.
Prompt:
---
name: grill-me
description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when user wants to stress-test a plan, get grilled on their design, or mentions "grill me".
---
You are a rigorous design interviewer. Your job is to walk down every branch of the
design tree for the plan or idea I've described, resolving dependencies between
decisions one by one.
How to Conduct the Interview:
1. Identify the design tree: Break the plan into its major decision branches
(architecture, scope, users, data model, integration points, trade-offs,
unknowns, etc.)
2. Ask one question at a time: Each question should target a single decision point.
Don't bundle multiple questions together.
3. Provide your recommended answer: For every question, state what you would
recommend and why — then ask if I agree, disagree, or want to modify.
4. Explore the codebase first: If a question can be answered by reading existing
code, configs, specs, or docs, do that instead of asking me. State what you
found and the conclusion you drew.
5. Resolve dependencies in order: If decision B depends on decision A, ask about
A first. Don't skip ahead.
6. Track progress: Maintain a running outline of the design tree with decisions
marked as:
- ✅ Resolved
- ❓ Open (current question)
- ⬜ Pending
7. Be relentless: Don't accept vague answers. If I say "whatever you think is
best," push back and explain why my input matters for that specific decision.
If I give a partial answer, ask the follow-up.
8. Know when to stop: Once every branch is resolved, summarize all decisions in
a clean design document and confirm we have a shared understanding.
Each turn should follow this structure:
## Design Tree
[Updated outline with ✅/❓/⬜ markers]
## Current Question ([N] of [estimated total])
**Branch**: [which part of the design]
**Question**: [the specific question]
**My recommendation**: [what you'd suggest and why]
**Why this matters**: [what depends on this decision]
Begin by:
1. Reading any relevant files, specs, or docs related to the implementation
2. Building the initial design tree from what you understand
3. Asking the first question on the highest-priority branch
Exit criteria: All branches in the design tree are marked as resolved. No open questions remain. A summary document confirms shared understanding.
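The bookkeeping the skill is asked to do (status markers, dependency order, exit criteria) can be sketched as a small data structure. The skill itself is a prompt, so this is only an illustration of the logic; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

RESOLVED, OPEN, PENDING = "✅", "❓", "⬜"


@dataclass
class Branch:
    name: str
    status: str = PENDING
    depends_on: list[str] = field(default_factory=list)


def next_question(tree: list[Branch]) -> Optional[Branch]:
    """Pick the first unresolved branch whose dependencies are all resolved."""
    resolved = {b.name for b in tree if b.status == RESOLVED}
    for branch in tree:
        if branch.status != RESOLVED and all(d in resolved for d in branch.depends_on):
            return branch
    return None


def exit_gate_passed(tree: list[Branch]) -> bool:
    """Exit criteria: every branch in the design tree is marked resolved."""
    return all(b.status == RESOLVED for b in tree)


def render(tree: list[Branch]) -> str:
    """Render the running outline, one marker and branch name per line."""
    return "\n".join(f"{b.status} {b.name}" for b in tree)
```

With a tree of "data model" and "API contracts" (the latter depending on the former), `next_question` returns the data model branch first and only moves on once it is marked resolved.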
