The Cascade

What Is It

The Cascade is a development methodology for building production software with AI coding agents. It applies waterfall's sequential discipline at feature scale — each feature cascades through a fixed sequence of phases — while running multiple cascades in parallel across sprint cycles.

Traditional waterfall fails because it applies rigid sequencing to an entire project. Agile succeeds by breaking work into small increments but often loses architectural rigor. The Cascade takes the best of both: every feature gets the full waterfall treatment, but each waterfall is small enough to complete in days, not months.

Sprint N
├── Feature A: ████████░░░░░░░░░░░░  (Implement → Review)
├── Feature B: ████████████████░░░░  (Review → Compound)
├── Feature C: ██░░░░░░░░░░░░░░░░░░  (Research → Specify)
└── Bug Fix D: ████████████████████  (Implement → Done)

Core thesis: the specification is the product, and the code is a derivative.

The spec is the source of truth — not the code. If the code doesn't match the spec, the code is wrong. This inverts how most developers think about software, where code is the authoritative artifact and docs are an afterthought that drifts out of date.

With AI coding agents, this inversion becomes practical for three reasons:

AI output quality is proportional to input quality. A vague prompt produces vague code. A precise spec produces precise code. The spec is the investment.
Code is cheap to regenerate; specs are not. An agent can rewrite an implementation in minutes. The hard work is the thinking behind what to build and why: the problem framing, the boundary decisions, the edge cases.
Specs are human-reviewable; code volume is not. A solo developer using AI agents can generate more code in a day than they can meaningfully review, but they can review a spec. If the spec is right, code review becomes verification against the spec rather than reasoning about intent from scratch.

The Cascade is built around this: every phase before Implementation is about getting the spec right. Implementation is mechanical — the TDD loop just translates the spec into tested code.

The Six Phases

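Each feature cascades through six phases in strict order. Large features run all six; medium features may skip only Phase 0 (Research), and smaller work follows the reduced paths listed under Scaling: When to Use What. Each phase produces specific artifacts and ends at a gate before the next begins.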

┌──────────────┐
│  0: RESEARCH  │  Understand the landscape
│  Artifact: Research Brief
└──────┬───────┘
       │
┌──────▼───────┐
│  1: SPECIFY   │  Define WHAT and WHY
│  Artifact: Product Spec
│  Optional: Prototype → Prototype Spec
└──────┬───────┘
       │
  ◆ Human Review Gate ◆
       │
┌──────▼───────┐
│  2: DESIGN    │  Define HOW
│  Artifacts: Technical Design + ADRs + Task Breakdown
└──────┬───────┘
       │
  ◆ Human Review Gate ◆
       │
  ◆ /grill-me Exit Gate ◆
       │
┌──────▼───────┐
│  3: IMPLEMENT │  TDD loop per task (fresh session each)
│  Artifacts: Tests + Code + PLANS.md
└──────┬───────┘
       │
  ◆ Automated Verification ◆
       │
┌──────▼───────┐
│  4: REVIEW    │  Fresh-context code review
│  Artifact: Review Notes
└──────┬───────┘
       │
┌──────▼───────┐
│  5: COMPOUND  │  Capture learnings, update rules, ship
│  Artifacts: Updated rules + memory + commit
└──────────────┘

Phase 0: Research

Goal: Understand the landscape before designing anything.

Process:

Define the research question
Launch parallel research subagents (competitive, technical, prior art)
Search GitHub for existing implementations, skeleton projects, libraries
Search package registries before writing utility code
Synthesize findings into a Research Brief

Artifact: Research Brief

---
type: research-brief
date: YYYY-MM-DD
project: <project name>
status: draft
---

# Research Brief: <Feature/Product Name>

## Research Question
What are we trying to learn?

## Competitive Landscape
| Product | Approach | Strengths | Gaps |
|---------|----------|-----------|------|

## Prior Art
- Existing implementations found
- Libraries/packages that solve part of the problem
- Open source projects worth forking/adapting

## Technical Patterns
- Common architectural approaches
- Proven patterns from similar systems

## Key Insights
- What should influence our design?
- What should we avoid?

## Recommendation
Build vs. buy vs. adapt decision with rationale

When to skip: Small features, bug fixes, well-understood domains.

Agent Pipeline:

Researcher (parallel subagents) → Librarian (file + tag + link)

Multiple Researcher agents run in parallel investigating different aspects (competitive, technical, prior art)
Librarian files the Research Brief, updates activity logs, and creates wiki-links to related vault content

Phase 1: Specify

Goal: Define WHAT to build and WHY. No implementation details — just behavior, requirements, and success criteria.

Process:

Start in Plan Mode or a dedicated spec session

Use the interview pattern for medium+ features:

I want to build [brief description]. Interview me in detail.
Ask about users, use cases, edge cases, concerns, and tradeoffs.
Don't ask obvious questions — dig into the hard parts I might not have considered.
Keep interviewing until we've covered everything.
Then write a complete Product Spec.

For smaller features, write the spec directly or have Claude draft from your notes
Review the spec carefully — this is the most important document
(Optional) Build a throwaway Prototype to visualize the product or feature
- The prototype is a visual exploration tool — it demonstrates what the product could be and the functionality it could have
- No prototype code advances into Phase 2. The prototype is discarded before Design begins
- If the prototype reveals valuable insights, produce a Prototype Spec documenting what worked, what didn't, and what carries forward as requirements
- The Prototype Spec becomes supporting material for the Design phase alongside the Product Spec

Artifact: Product Spec

---
type: product-spec
date: YYYY-MM-DD
project: <project name>
status: draft | in-review | approved
tags: []
---

# Product Spec: <Feature Name>

## Problem Statement
What problem are we solving and for whom?

## User Stories
- As a [user], I want to [action], so that [value]
- Include acceptance criteria for each story

## Functional Requirements
### Phase 1 (MVP)
- Requirement with testable acceptance criteria

### Phase 2 (Enhancement)
- Requirement with testable acceptance criteria

## Non-Functional Requirements
- Performance: <specific, measurable targets>
- Security: <specific requirements>
- Scale: <expected load, data volumes>

## Boundaries
### DO NOT CHANGE
- Stable components that must not be modified
- Existing API contracts to preserve
- Database schemas to respect

### Must Ask Before Changing
- Architectural decisions that need human approval
- External service integrations

## Edge Cases
- What happens when X?
- How should the system handle Y?

## Success Metrics
- How do we know this works? (testable)

## Open Questions
- Unresolved decisions that need input

Artifact: Prototype Spec (optional — only when a prototype was built)

---
type: prototype-spec
date: YYYY-MM-DD
project: <project name>
prototype-status: discarded
---

# Prototype Spec: <Feature Name>

## Prototype Overview
What was built and what it demonstrated

## What Worked
- Functionality or UX patterns that validated well
- Interactions that felt right
- These carry forward as requirements into Design

## What Didn't Work
- Approaches that were confusing, slow, or wrong
- Anti-patterns discovered during exploration
- These are explicitly excluded from Design

## Insights
- Surprises or discoveries not anticipated in the Product Spec
- Revised assumptions based on seeing the feature in action

## Recommendations for Design
- Specific guidance for the Technical Design phase based on prototype learnings

Key principles for AI-optimized specs:

Be explicit — AI cannot infer unstated requirements
Include "DO NOT CHANGE" sections — AI will cheerfully rewrite stable code unless told not to
Use phased delivery — prevents the agent from attempting too much at once
Include examples — concrete input/output scenarios dramatically improve implementation quality
Define boundaries — what NOT to do is often more valuable than what to do

Agent Pipeline:

Product Manager (spec writing, interview) → Tech Writer (polish) → Librarian (file + tag + link)

Product Manager drives the interview pattern, writes the Product Spec with requirements, boundaries, and acceptance criteria
Tech Writer polishes the spec into a clean, professional document
Librarian files the Product Spec (and Prototype Spec if applicable), updates activity logs
Prototype building (when used) is direct human + Claude work — no agent pipeline, vibe-code style

Session management: Spec creation can consume significant context. After the spec is written, /clear and start fresh for the Design phase.

Phase 2: Design

Goal: Define HOW to build it. Architecture, tech stack, data models, API contracts, and a sequenced task breakdown.

Process:

Start a fresh session with the approved Product Spec (and Prototype Spec, if one exists)

Use Plan Mode for architecture exploration:

Read the Product Spec at <path>. Create a Technical Design document
covering architecture, data models, API contracts, and key decisions.
Then create ADRs for each significant architectural choice.
Finally, break the implementation into sequenced tasks with dependencies.

Review all three artifacts before moving to implementation
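
The requirement to start Design from an approved Product Spec could be backed by a small mechanical check. The sketch below assumes the gate result is recorded in the `status` field of the spec frontmatter shown in Phase 1. The function names `read_frontmatter` and `passed_review_gate` are hypothetical, and the parser handles only flat `key: value` lines, not full YAML.

```python
def read_frontmatter(text: str) -> dict[str, str]:
    """Parse a flat '---' delimited frontmatter block into a dict."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # No closing delimiter: treat as missing frontmatter.


def passed_review_gate(spec_text: str) -> bool:
    """Assumed convention: a spec passes the gate once status is 'approved'."""
    return read_frontmatter(spec_text).get("status") == "approved"
```

A Design session could call `passed_review_gate` on the spec file contents before loading anything else, and stop if it returns `False`.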

Artifact: Technical Design

---
type: technical-design
date: YYYY-MM-DD
project: <project name>
spec: "[[Product Spec Name]]"
status: draft | approved
---

# Technical Design: <Feature Name>

## Architecture Overview
High-level architecture with component diagram (Mermaid or ASCII)

## Technology Stack
- Language/framework with specific versions
- Dependencies with version constraints
- Infrastructure requirements

## Data Models
Schema definitions with types and constraints

## API Contracts
Endpoint definitions with request/response schemas

## Component Design
For each major component:
- Interface (inputs, outputs)
- Internal behavior
- Dependencies
- Error handling strategy

## Integration Points
How this connects to existing systems

## Security Considerations
Authentication, authorization, data protection

## Performance Targets
Specific, measurable benchmarks

## Testing Strategy
- Unit test approach
- Integration test approach
- What to mock, what to test against real services

Artifact: Architecture Decision Records

---
type: adr
date: YYYY-MM-DD
status: proposed | accepted | deprecated
---

# ADR-NNN: <Decision Title>

## Context
Forces at play, problem description

## Decision
The change we're making

## Consequences
What becomes easier, what becomes harder, trade-offs accepted

Artifact: Task Breakdown

---
type: task-breakdown
date: YYYY-MM-DD
spec: "[[Product Spec Name]]"
design: "[[Technical Design Name]]"
---

# Task Breakdown: <Feature Name>

## Task 1: <Title>
- **Description:** What to implement
- **Dependencies:** None | Task N
- **Files:** Files to create or modify
- **Acceptance Criteria:** Testable conditions
- **Verification:** Test command to run
- **Status:** pending | in-progress | done
- **Notes:** (updated during implementation)

## Task 2: <Title>
...

Key principles:

Tasks should be small enough for a single session (clean context)
Each task has explicit verification criteria
Dependencies are ordered — no task starts until its dependencies are verified
The "Notes" field is updated during implementation (persistent storage pattern)
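
The dependency rule above can be sketched with Python's standard `graphlib` module. The task names and the `dependencies` mapping are hypothetical; the mapping mirrors the "Dependencies" field of each task, with each task pointing at the tasks it depends on.

```python
from graphlib import TopologicalSorter


def execution_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which every task follows its dependencies.

    TopologicalSorter raises graphlib.CycleError if the tasks depend on
    each other in a loop.
    """
    return list(TopologicalSorter(dependencies).static_order())


def ready_tasks(dependencies: dict[str, set[str]], verified: set[str]) -> list[str]:
    """Tasks that are not yet verified and whose dependencies are all verified."""
    return sorted(
        task
        for task, deps in dependencies.items()
        if task not in verified and deps <= verified
    )


# Hypothetical breakdown: task-2 needs task-1, task-3 needs both.
deps = {
    "task-1": set(),
    "task-2": {"task-1"},
    "task-3": {"task-1", "task-2"},
}
```

With this breakdown, `ready_tasks(deps, set())` yields only `task-1`, and `task-2` becomes available once `task-1` is in the verified set.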

Agent Pipeline:

Planner (architecture + task sequencing) → Product Manager (ADRs) → Tech Writer (design doc) → Librarian (file + tag + link) → grill-me (exit gate)

Planner drives architecture exploration, component decomposition, and task sequencing
Product Manager writes ADRs for each significant architectural choice
Tech Writer produces the polished Technical Design document
Librarian files all artifacts, updates activity logs, and cross-links to the Product Spec
Exit gate: Run /grill-me against the Technical Design and Task Breakdown. The skill interviews relentlessly about every architectural decision, dependency, and trade-off. Design is not complete until grill-me reaches a shared understanding with no open questions. See Appendix A: AI Skills for the full skill definition.

Phase 3: Implement

Goal: Execute tasks one at a time using TDD. Human writes tests (or test specs), AI implements.

Process (per task):

Start a fresh session — clean context focused on this one task

Load only the relevant context:

Read the Task Breakdown at <path>. We're implementing Task N.
Read the Product Spec at <path> for requirements context.
Read the Technical Design at <path> for architecture decisions.

Write tests first (or have Claude write tests from acceptance criteria, then review):

Write tests for Task N based on the acceptance criteria.
Do NOT implement yet — just the tests. They should all fail.

Implement to pass tests:

Now implement Task N. Make all tests pass.
Run the full test suite and fix any regressions.

Update the task status and notes in the Task Breakdown
/clear before the next task

For complex tasks — use PLANS.md:

# PLANS.md

## Goal
What this implementation session achieves

## Steps
- [ ] Step 1
- [ ] Step 2
- [ ] Step 3

## Progress
(Updated as work proceeds)

## Surprises & Discoveries
(Unexpected findings during implementation)

## Decision Log
(Choices made during execution and rationale)

TDD rules:

Never let AI modify tests to make them pass
Tests must run in seconds — long test suites kill the feedback loop
One test at a time for critical logic
Review AI-written tests before trusting them as specification

Session management:

One fresh session per task or small task group
Never let a single session span unrelated tasks
Use subagents for investigation so exploration doesn't fill main context
If you've corrected the agent twice on the same issue, /clear and restart with a better prompt

Agent Pipeline:

tdd-guide (per task TDD loop)

tdd-guide enforces the write-tests-first discipline for each task
No other agents in this phase — implementation is a tight loop between the human, the agent, and the test suite

Phase 4: Review

Goal: Code review in a fresh context, free from implementation bias.

Process:

Start a fresh session (separate from the implementation session)

Point the reviewer at the changes:

Review the changes for <feature>. Check for:
- Correctness against the Product Spec at <path>
- Edge cases and error handling
- Security vulnerabilities
- Performance concerns
- Code quality and maintainability
- Consistency with existing patterns

Address findings, re-run tests

Why fresh context matters: An AI agent in a fresh session is more likely to catch issues it overlooked during implementation, because it is not anchored to the reasoning of the writing session.

For larger features: Use parallel sessions with git worktrees:

Session A: Implements
Session B: Reviews Session A's output
Session A: Addresses review feedback

Agent Pipeline:

All reviewers run in parallel (fresh context):
├── code-reviewer        (correctness, quality, patterns)
├── security-reviewer    (vulnerabilities, auth, data protection)
├── python-reviewer      (Python idioms, typing, packaging)
├── go-reviewer          (Go idioms, error handling, concurrency)
├── rust-reviewer        (ownership, lifetimes, unsafe usage)
└── cpp-reviewer         (memory safety, RAII, modern C++ practices)

All reviewers launch in parallel in a fresh session — none share context with the implementation session
Language-specific reviewers activate based on which languages are present in the changeset
Each reviewer produces findings independently; findings are merged and deduplicated before addressing
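
The merge-and-deduplicate step can be sketched as follows. The `Finding` shape is hypothetical, and so is the duplicate rule used here (same file, same line, same message after normalizing case and whitespace); real reviewer output would likely need a fuzzier match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    reviewer: str  # e.g. "security-reviewer"
    file: str
    line: int
    message: str


def merge_findings(*reports: list[Finding]) -> list[dict]:
    """Merge reviewer reports, collapsing findings that point at the same issue."""
    merged: dict[tuple[str, int, str], dict] = {}
    for report in reports:
        for finding in report:
            # Assumed duplicate key: file, line, and normalized message text.
            normalized = " ".join(finding.message.lower().split())
            key = (finding.file, finding.line, normalized)
            entry = merged.setdefault(
                key,
                {
                    "file": finding.file,
                    "line": finding.line,
                    "message": finding.message,
                    "reviewers": [],
                },
            )
            if finding.reviewer not in entry["reviewers"]:
                entry["reviewers"].append(finding.reviewer)
    return sorted(merged.values(), key=lambda e: (e["file"], e["line"]))
```

Keeping the list of reviewers on each merged entry preserves the signal that several independent reviewers flagged the same line.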

Phase 5: Compound

Goal: Capture learnings, update project context, commit and ship.

Process:

Update rules files with any new patterns discovered
Update CLAUDE.md if new conventions emerged
Save learnings to memory for future sessions
Commit with descriptive message
Update activity logs

What to capture:

New coding patterns that worked well → add to rules
Gotchas or non-obvious behaviors → add to CLAUDE.md
Workflow preferences confirmed or corrected → save to memory
Libraries or approaches that proved valuable → note for future reference

This is the compound learning loop. Each cascade makes the next one better because the project's context files grow more precise and the agent needs less correction.

Agent Pipeline:

Librarian (file + tag + link + activity logs)

Librarian files any new artifacts, updates activity logs, and ensures cross-links are current
Rule and memory updates are done directly by the human + Claude — no agent delegation for learning capture

Scaling: When to Use What

Feature Size → Process Depth

| Size | Example | Phases | Duration |
|------|---------|--------|----------|
| Trivial | Fix typo, add log line | Phase 3 only | Minutes |
| Small | Add endpoint, UI tweak | 1 → 3 → 5 | < 1 hour |
| Medium | New feature, API redesign | 1 → 2 → 3 → 4 → 5 | Hours |
| Large | New module, system integration | 0 → 1 → 2 → 3 → 4 → 5 | Days |
| Greenfield | New project from scratch | All phases, multiple cycles | Weeks |

Decision Framework

Is it a one-line fix?
  └─ Yes → Just do it (Phase 3 only)
  └─ No → Can you describe the diff in one sentence?
       └─ Yes → Brief spec + implement (Phase 1 light + Phase 3)
       └─ No → Does it touch multiple files or systems?
            └─ No → Spec + implement + review (Phase 1 + 3 + 4)
            └─ Yes → Full cascade (Phase 0-5)
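
The decision tree above maps directly onto a small function. This sketch follows the tree exactly as drawn; the function and parameter names are hypothetical.

```python
def choose_phases(one_line_fix: bool, one_sentence_diff: bool, multi_file: bool) -> list[int]:
    """Return the phase numbers to run, following the decision tree top to bottom."""
    if one_line_fix:
        return [3]  # Just do it
    if one_sentence_diff:
        return [1, 3]  # Brief spec (Phase 1 light) + implement
    if not multi_file:
        return [1, 3, 4]  # Spec + implement + review
    return [0, 1, 2, 3, 4, 5]  # Full cascade
```

The questions are evaluated in order, so a one-line fix that happens to touch two files still resolves to Phase 3 only.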

Session Architecture

Each phase gets its own session, except Phase 3, which gets one session per task. Context is cleared between sessions. The artifact from one phase becomes the input to the next; this is the cascade.

Session 1: Research (Phase 0)
  └─ Subagents do parallel research
  └─ Output: Research Brief
  └─ /clear

Session 2: Specification (Phase 1)
  └─ Interview pattern → Product Spec
  └─ Optional: Prototype → Prototype Spec
  └─ /clear

Session 3: Design (Phase 2)
  └─ Load spec → Technical Design + ADRs + Task Breakdown
  └─ /grill-me exit gate
  └─ /clear

Session 4..N: Implementation (Phase 3, one per task)
  └─ Load task + spec + design → TDD loop
  └─ /clear after each task

Session N+1: Review (Phase 4)
  └─ Fresh context → review all changes
  └─ /clear

Session N+2: Compound (Phase 5)
  └─ Capture learnings → commit → ship
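
The session numbering above depends only on the number of tasks in the Task Breakdown. As a sketch (the function name is hypothetical), a full cascade with `task_count` tasks expands to:

```python
def plan_sessions(task_count: int) -> list[tuple[int, str]]:
    """Number the sessions of a full cascade: three setup phases,
    one implementation session per task, then Review and Compound."""
    if task_count < 1:
        raise ValueError("a cascade needs at least one task")
    labels = ["Research (Phase 0)", "Specification (Phase 1)", "Design (Phase 2)"]
    labels += [f"Implementation (Phase 3, task {n})" for n in range(1, task_count + 1)]
    labels += ["Review (Phase 4)", "Compound (Phase 5)"]
    return list(enumerate(labels, start=1))
```

For a two-task feature this gives seven sessions: implementation occupies sessions 4 and 5, Review is session 6, and Compound is session 7.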

Project Folder Structure

Cascade artifacts live in a versioned sdd/ folder at the project root. Each version is a complete, self-contained snapshot of the spec-driven design at that point in time.

<project-root>/
└── sdd/
    ├── v1/
    │   ├── research/    ← Research Briefs
    │   ├── specs/       ← Product Specs, Prototype Specs, Task Breakdowns
    │   └── docs/        ← Technical Designs, ADRs
    │       └── adr/
    ├── v2/
    │   ├── research/
    │   ├── specs/
    │   └── docs/
    │       └── adr/
    └── ...

Versioning rule: copy forward. When a new version is cut, every artifact from the prior version is copied into the new version folder as the baseline. Changes happen in the new version; prior versions are frozen.

This preserves a clean, navigable history of how the design evolved. Reading any vX/ folder gives you the complete picture of the system as it was specified at that version — no need to reconstruct state by tracing diffs across folders.
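
The copy-forward rule can be sketched with the standard library. The function name `cut_version` is hypothetical; the sketch assumes version folders are named `v1`, `v2`, and so on, directly under `sdd/`.

```python
import re
import shutil
from pathlib import Path


def cut_version(sdd_root: Path) -> Path:
    """Copy the latest vN folder to vN+1 and return the new folder."""
    versions = []
    for entry in sdd_root.iterdir():
        match = re.fullmatch(r"v(\d+)", entry.name)
        if entry.is_dir() and match:
            versions.append(int(match.group(1)))
    if not versions:
        raise ValueError("no existing vN folder to copy forward from")
    latest = max(versions)
    new_dir = sdd_root / f"v{latest + 1}"
    # copytree refuses to overwrite an existing destination by default,
    # so an already-cut version is never clobbered.
    shutil.copytree(sdd_root / f"v{latest}", new_dir)
    return new_dir
```

The prior version is left untouched on disk; keeping it frozen afterwards is a matter of convention or repository permissions, which this sketch does not enforce.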

When to bump versions:

Major product release or milestone
Significant scope change that invalidates prior design assumptions
A new "phase" of the product where prior specs no longer reflect intent

Trivial edits and corrections happen in place within the current version.

Artifact Map

All Artifacts by Phase

| Phase | Artifact | Format | Location | Consumer |
|-------|----------|--------|----------|----------|
| 0 Research | Research Brief | Markdown | `sdd/vX/research/` | Human + AI |
| 1 Specify | Product Spec | Markdown | `sdd/vX/specs/` | Human + AI |
| 1 Specify | Prototype Spec (optional) | Markdown | `sdd/vX/specs/` | Human + AI |
| 2 Design | Technical Design | Markdown | `sdd/vX/docs/` | Human + AI |
| 2 Design | ADRs | Markdown | `sdd/vX/docs/adr/` | Human + AI |
| 2 Design | Task Breakdown | Markdown | `sdd/vX/specs/` | AI (primary) |
| 3 Implement | Tests | Code | Project test directory | AI + CI |
| 3 Implement | Code | Code | Project source directory | AI + Human |
| 3 Implement | PLANS.md | Markdown | Project root (temporary) | AI |
| 4 Review | Review Notes | In-session | Ephemeral (or saved) | Human |
| 5 Compound | Updated Rules | Markdown | `.claude/rules/` | AI |
| 5 Compound | Updated CLAUDE.md | Markdown | Project root | AI |
| 5 Compound | Memory entries | Markdown | `.claude/projects/*/memory/` | AI |

Artifact Flow

Research Brief ─────► Product Spec ──────┬──► Technical Design
                      Prototype Spec ────┘         │
                           │                       ├──► ADRs
                           │                       │
                           └───────────────────────┴──► Task Breakdown
                                                            │
                                                      /grill-me gate
                                                            │
                                                      ┌─────┼─────┐
                                                      │     │     │
                                                   Task 1 Task 2 Task N
                                                      │     │     │
                                                    Tests  Tests  Tests
                                                      │     │     │
                                                    Code   Code   Code
                                                      │     │     │
                                                      └─────┼─────┘
                                                            │
                                                         Review
                                                            │
                                                      ┌─────┼─────┐
                                                      │     │     │
                                                    Rules Memory CLAUDE.md
                                                   (compound learning loop)

Hook Integration

| Hook | Phase | Purpose |
|------|-------|---------|
| PreToolUse:Edit on test files | 3 Implement | Block AI from modifying tests to pass them |
| PostToolUse:Edit | 3 Implement | Auto-lint after every file edit |
| PostToolUse:Write | 3 Implement | Auto-format new files |
| Stop | 5 Compound | Extract learnings before session ends |
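
The test-file guard could be implemented as a small hook script along these lines. Several things here are assumptions rather than facts from this document: that the hook receives a JSON payload on stdin with `tool_name` and `tool_input.file_path` fields, that a non-zero exit code of 2 tells the agent runtime to block the edit, and that test files follow the naming patterns listed in `TEST_PATTERNS`. Check the hook documentation of your agent runtime before relying on any of them.

```python
import json
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical naming conventions for test files in this project.
TEST_PATTERNS = ("test_*.py", "*_test.py", "*.test.ts")

ALLOW = 0
BLOCK = 2  # Assumed "block this tool call" exit code.


def is_test_file(path: str) -> bool:
    """True if the path sits under a tests/ directory or matches a test pattern."""
    p = PurePosixPath(path)
    return "tests" in p.parts or any(fnmatchcase(p.name, pat) for pat in TEST_PATTERNS)


def hook_exit_code(stdin_text: str) -> int:
    """Decide the exit code for one hook invocation from its JSON payload."""
    payload = json.loads(stdin_text)
    if payload.get("tool_name") != "Edit":
        return ALLOW
    file_path = payload.get("tool_input", {}).get("file_path", "")
    return BLOCK if is_test_file(file_path) else ALLOW


# As a script entry point, this would be wired up as:
#   sys.exit(hook_exit_code(sys.stdin.read()))
```

Keeping the decision in a pure function makes the guard easy to unit-test without launching an agent session.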

Parallel Execution Patterns

| Pattern | Phase | How |
|---------|-------|-----|
| Parallel research | 0 Research | Multiple subagents investigate different aspects simultaneously |
| Fan-out migration | 3 Implement (batch) | `claude -p "Migrate $file"` in parallel across files |
| Writer/Reviewer | 3-4 | Implementation + review in separate worktrees |
| Multi-perspective review | 4 Review | Security + code + language-specific reviewers in parallel |
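
The fan-out migration pattern can be driven from a short script. This sketch builds the `claude -p "Migrate $file"` command from the table for each file and runs the commands on a thread pool. The helper names, the worker count, and the injectable `runner` parameter are assumptions added for illustration and testing.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def build_command(file: str) -> list[str]:
    """Argument list for one headless migration run."""
    return ["claude", "-p", f"Migrate {file}"]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code without raising on failure."""
    return subprocess.run(command, check=False).returncode


def fan_out(
    files: Sequence[str],
    runner: Callable[[Sequence[str]], int] = run_command,
    max_workers: int = 4,
) -> dict[str, int]:
    """Run one migration per file in parallel; map each file to its exit code."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = pool.map(lambda f: runner(build_command(f)), files)
        return dict(zip(files, codes))
```

Files with a non-zero exit code can be retried individually in a fresh session rather than rerunning the whole batch.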

Principles

Seven principles underpin The Cascade. They recur across production-grade AI development methodologies:

| # | Principle | Why It Matters |
|---|-----------|----------------|
| 1 | Specify before implementing | AI output quality is directly proportional to input quality. |
| 2 | Structure context deliberately | Context window is the #1 constraint. Performance degrades as it fills. |
| 3 | Work in small increments | One function, one feature, one test at a time. Prevents compounding errors. |
| 4 | Maintain human review gates | AI generates, humans validate. |
| 5 | Automate verification | Tests, linters, CI/CD as safety nets. The single highest-leverage practice. |
| 6 | Document for the AI, not just humans | Explicit, example-driven, boundary-aware specs. AI cannot infer unstated requirements. |
| 7 | Feed learnings back | Compound knowledge over time. Rules, memory, and learnings create a virtuous cycle. |

Appendix A: AI Skills

Skills are reusable prompts invoked via slash commands during a cascade. They are distinct from agents — an agent is a persona with capabilities and routing rules, while a skill is a structured prompt that can be run by any agent or directly by the human.

`/grill-me` — Design Interview

Used as: Exit gate for Phase 2 (Design)

Purpose: Relentlessly interview every aspect of the design until all design decisions are resolved. Catches gaps, vague thinking, and unstated assumptions before work moves to Implementation.

Prompt:

---
name: grill-me
description: Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when user wants to stress-test a plan, get grilled on their design, or mentions "grill me".
---

You are a rigorous design interviewer. Your job is to walk down every branch of the
design tree for the plan or idea I've described, resolving dependencies between
decisions one by one.

How to Conduct the Interview:

1. Identify the design tree: Break the plan into its major decision branches
   (architecture, scope, users, data model, integration points, trade-offs,
   unknowns, etc.)

2. Ask one question at a time: Each question should target a single decision point.
   Don't bundle multiple questions together.

3. Provide your recommended answer: For every question, state what you would
   recommend and why — then ask if I agree, disagree, or want to modify.

4. Explore the codebase first: If a question can be answered by reading existing
   code, configs, specs, or docs, do that instead of asking me. State what you
   found and the conclusion you drew.

5. Resolve dependencies in order: If decision B depends on decision A, ask about
   A first. Don't skip ahead.

6. Track progress: Maintain a running outline of the design tree with decisions
   marked as:
   - ✅ Resolved
   - ❓ Open (current question)
   - ⬜ Pending

7. Be relentless: Don't accept vague answers. If I say "whatever you think is
   best," push back and explain why my input matters for that specific decision.
   If I give a partial answer, ask the follow-up.

8. Know when to stop: Once every branch is resolved, summarize all decisions in
   a clean design document and confirm we have a shared understanding.

Each turn should follow this structure:

  ## Design Tree
  [Updated outline with ✅/❓/⬜ markers]

  ## Current Question ([N] of [estimated total])
  **Branch**: [which part of the design]
  **Question**: [the specific question]
  **My recommendation**: [what you'd suggest and why]
  **Why this matters**: [what depends on this decision]

Begin by:
1. Reading any relevant files, specs, or docs related to the implementation
2. Building the initial design tree from what you understand
3. Asking the first question on the highest-priority branch

Exit criteria: All branches in the design tree are marked as resolved. No open questions remain. A summary document confirms shared understanding.
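
The design tree and its exit criteria can be modeled in a few lines. This is an illustrative sketch only: the `Branch` and `Status` names are hypothetical, and the skill itself tracks the tree in conversation rather than in code.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterator


class Status(Enum):
    RESOLVED = "✅"
    OPEN = "❓"
    PENDING = "⬜"


@dataclass
class Branch:
    title: str
    status: Status = Status.PENDING
    children: list["Branch"] = field(default_factory=list)


def walk(branch: Branch) -> Iterator[Branch]:
    """Yield the branch and every descendant, depth first."""
    yield branch
    for child in branch.children:
        yield from walk(child)


def exit_criteria_met(root: Branch) -> bool:
    """The gate passes only when every branch in the tree is resolved."""
    return all(b.status is Status.RESOLVED for b in walk(root))


def render(branch: Branch, depth: int = 0) -> list[str]:
    """Render the tree as indented lines using the status markers."""
    lines = [f"{'  ' * depth}{branch.status.value} {branch.title}"]
    for child in branch.children:
        lines.extend(render(child, depth + 1))
    return lines
```

A single open or pending branch anywhere in the tree keeps `exit_criteria_met` false, which matches the rule that no open questions may remain.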
