Table of Contents:
- Step 1: Setup Environment
- Step 2: Prompt History Records (PHR) - Automatic Knowledge Capture
- Step 3: Create the Project Rulebook - The Constitution!
- Step 4: Specify the Feature (The "What" and "Why")
- Step 5: Define the Technical Plan (The "How")
- Step 6: Architecture Decision Records (ADR) - Post-Planning Review
- Step 7: Generate the Actionable Task List
- Step 8: Implement, Test, and Validate
- Step 9: Clarify & Analyze Spec Deep Dive
What is Spec-Driven Development?
Instead of coding first and writing docs later, in spec-driven development, you start with a (you guessed it) spec. This is a contract for how your code should behave and becomes the source of truth your tools and AI agents use to generate, test, and validate code. The result is less guesswork, fewer surprises, and higher-quality code.
In 2025, this matters because:
- AI IDEs and agent SDKs can turn ambiguous prompts into a lot of code quickly. Without a spec, you just get elegant garbage faster.
- Agent platforms (e.g., OpenAI Agents SDK) make multi-tool, multi-agent orchestration cheap—but the cost of weak specifications is amplified at scale.
- The broader ecosystem (e.g., GitHub’s recent “spec-driven” tooling push) is converging on spec-first workflows for AI software.
Why it beats “vibe coding”
- Captures decisions in a reviewable artifact instead of buried chat threads.
- Speeds onboarding and cross-team collaboration.
- Reduces rework and drift because tests/examples anchor behavior.
Tools & patterns mentioned/adjacent in the ecosystem
- Spec-Kit Plus (Panaversity open-source toolkit)
- Spec-Kit (GitHub’s open-source toolkit) — templates and helpers for running an SDD loop with your AI tool of choice.
- Broader coverage in recent articles summarizing SDD’s rise and best practices.
How Spec-Kit Plus Works: Automatic Documentation + Explicit Decision Points
Spec-Kit Plus extends GitHub's Spec Kit with two key innovations:
1. Automatic Prompt History Records (PHR)
Every significant AI interaction is automatically captured as a structured artifact—no extra commands needed. You work normally, and get complete documentation of your AI-assisted development journey.
What gets captured automatically:
- `/sp.constitution` commands → PHR created
- `/sp.specify` commands → PHR created
- `/sp.plan` commands → PHR created + ADR suggestion
- `/sp.tasks` commands → PHR created
- `/implement` commands → PHR created
- Debugging, refactoring, explanations → PHRs created
You see a brief confirmation like `📝 PHR-0003 recorded`.
2. Explicit Architecture Decision Records (ADR)
After planning completes, you get a suggestion to review for architectural
decisions. You explicitly run /adr when ready to capture
significant technical choices.
Flow:
```
/plan completes
   ↓
📋 "Review for architectural decisions? Run /adr"
   ↓
(You run /adr when ready)
   ↓
ADRs created in docs/adr/ (if decisions are significant)
```
Why explicit? Architectural decisions require careful judgment, team discussion, and review of existing patterns. You control when this happens.
Quick Reference: Commands & Automation
| Command | What It Does | PHR Created? | ADR Created? |
|---|---|---|---|
| `/sp.constitution` | Define project principles | ✅ Automatic | ❌ No |
| `/sp.specify` | Write feature spec | ✅ Automatic | ❌ No |
| `/sp.plan` | Design architecture | ✅ Automatic | 📋 Suggestion only |
| `/sp.adr` | Review architectural decisions | ❌ No* | ✅ Explicit |
| `/sp.tasks` | Break down implementation | ✅ Automatic | ❌ No |
| `/sp.implement` | Execute TDD cycle | ✅ Automatic | ❌ No |
| Debugging | Fix errors | ✅ Automatic | ❌ No |
| Refactoring | Clean up code | ✅ Automatic | ❌ No |
| `/sp.phr` (manual) | Override automatic PHR | ✅ Explicit | ❌ No |

\* The `/adr` command itself doesn't create a PHR, but the planning session before it does.
Ready to build muscle memory for spec-driven development? Start Shipping! 🚀
Note: Use `specifyplus` or `sp` commands.
Official Spec Kit Plus resources
- Spec Kit Plus GitHub repository — enhanced templates, scripts, and CLI
- PyPI package — install with `pip install specifyplus`
Step 1: Setup Environment
Goal: bring your workstation to a known-good baseline so Spec Kit Plus and the SDD loop run without friction.
Inputs
- Git installed and configured with your preferred editor
- Python 3.10+ or the latest Astral `uv` runtime (used by `uvx`)
- Any coding agent of your choice (Qwen Code, Gemini CLI, Claude Code, Cursor, GitHub Copilot, Roo, etc.)
Actions
Quick start with SpecifyPlus CLI
Install SpecifyPlus (persistent option recommended):

```bash
# From PyPI (recommended)
pip install specifyplus

# or with uv tools
uv tool install specifyplus
```

Alternative (one-off):

```bash
uvx specifyplus --help
uvx specifyplus init <PROJECT_NAME>
# or
uvx sp init <PROJECT_NAME>
```

Run the readiness checks:

```bash
specifyplus --help    # or sp --help
specifyplus check     # or sp check
```

Bootstrap your project:

```bash
specifyplus init <PROJECT_NAME>
# or
sp init <PROJECT_NAME>
```

Follow the slash-command sequence inside your coding agent (Copilot, Claude Code, Cursor, Gemini CLI, etc.).

Inspect the generated `.github/` and `.specify/` folders, then delete the sandbox once you understand the layout.
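As a rough orientation, a freshly initialized project contains agent command prompts plus Spec Kit scaffolding. The exact layout varies by release and chosen agent, so treat this tree as illustrative only:

```
hello_spp/
├── .github/                      # agent-specific command prompts (varies by agent)
├── .specify/
│   ├── memory/
│   │   └── constitution.md       # project rulebook, filled via /sp.constitution
│   └── templates/                # spec / plan / task / PHR templates
└── scripts/
    └── bash/
        └── create-phr.sh         # manual PHR helper referenced later in this guide
```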
Slash commands (Spec Kit 2025)
| Command | Purpose |
|---|---|
| `/sp.constitution` | Create or update project principles and guardrails. |
| `/sp.specify` | Capture the “what” and “why” of the feature or product. |
| `/sp.clarify` | Resolve ambiguities before planning; must run before `/plan` unless explicitly skipped. |
| `/sp.plan` | Produce the technical approach, stack choices, and quickstart. |
| `/sp.adr` | Record Architecture Decision Records. |
| `/sp.tasks` | Break the plan into actionable units of work. |
| `/sp.analyze` | Check cross-artifact coverage and highlight gaps after `/tasks`. |
| `/sp.implement` | Execute tasks in sequence with automated guardrails. |
| `/sp.phr` | Create a prompt history record for the prompt. |
Deliverables
- A fresh repository ready for Spec Kit
- Verified `uvx` runner capable of invoking `specifyplus`
Ready to build muscle memory for spec-driven development? Start Shipping! 🚀
Step 2: Prompt History Records (PHR) - Automatic Knowledge Capture
Built into Spec Kit: Every AI exchange is automatically captured as a structured artifact—no extra commands needed.
The Problem: Lost Knowledge
Every day, developers have hundreds of AI conversations that produce valuable code, insights, and decisions. But this knowledge disappears into chat history, leaving you to:
- Reinvent solutions you already figured out
- Debug without context of why code was written that way
- Miss patterns in what prompts actually work
- Lose traceability for compliance and code reviews
The Solution: Automatic Prompt History Records
PHRs are created automatically after every significant AI interaction in Spec Kit. No extra commands, no manual effort—just work normally and get complete documentation of your AI-assisted development journey.
Core Learning Science Principles
| Principle | How PHRs Apply | Daily Benefit |
|---|---|---|
| Spaced Repetition | Revisit PHRs weekly to reinforce successful strategies | Build muscle memory for effective prompting |
| Metacognition | Reflect on what worked/didn't work in each exchange | Develop better prompting intuition |
| Retrieval Practice | Search PHRs when facing similar problems | Access proven solutions instantly |
| Interleaving | Mix different types of prompts (architect/red/green) | Strengthen transfer across contexts |
How It Works: Completely Automatic
Setup (One Time)
When you run `specify init`:

```bash
specify init --ai gemini   # or claude, cursor, copilot, etc.
```
You automatically get:
- ✅ Implicit PHR creation built into every command
- ✅ PHR templates and scripts
- ✅ Deterministic location logic (pre-feature vs feature-specific)
- ✅ Manual `/phr` command for custom cases (optional)
Daily Usage: Just Work Normally
PHRs are created automatically after:
```
/sp.constitution Define quality standards   → PHR created in docs/prompts/
/sp.specify Create authentication feature   → PHR created in docs/prompts/
/sp.plan Design JWT system                  → PHR created in specs/001-auth/prompts/
/sp.tasks Break down implementation         → PHR created in specs/001-auth/prompts/
/sp.implement Write JWT token generation    → PHR created in specs/001-auth/prompts/
```
Also after general work:
- Technical questions producing code → PHR created
- Debugging or fixing errors → PHR created
- Code explanations → PHR created
- Refactoring → PHR created
You see a brief confirmation like `📝 PHR-0003 recorded`.
That's it! Keep working, and your knowledge compounds automatically.
Deterministic PHR Location Strategy
PHRs use a simple, deterministic rule for where they're stored:
Before Feature Exists (Pre-Feature Work)
Location: `docs/prompts/`
Stages: `constitution`, `spec`
Naming: `0001-title.constitution.prompt.md`
Use cases:
- Creating constitution.md
- Writing initial specs
Example:

```
docs/
└── prompts/
    ├── 0001-define-quality-standards.constitution.prompt.md
    └── 0002-create-auth-spec.spec.prompt.md
```
Note: The `general` stage can also fall back to `docs/prompts/` if no `specs/` directory exists, but it will show a warning suggesting you use the `constitution` or `spec` stages instead, or create a feature first.
After Feature Exists (Feature Work)
Location: `specs/<feature>/prompts/`
Stages: `architect`, `red`, `green`, `refactor`, `explainer`, `misc`, `general`
Naming: `0001-title.architect.prompt.md`
Use cases:
- Feature planning and design
- Implementation work
- Debugging and fixes
- Code refactoring
- General feature work
Example:

```
specs/
├── 001-authentication/
│   ├── spec.md
│   ├── plan.md
│   └── prompts/
│       ├── 0001-design-jwt-system.architect.prompt.md
│       ├── 0002-implement-jwt.green.prompt.md
│       ├── 0003-fix-token-bug.red.prompt.md
│       └── 0004-setup-docs.general.prompt.md
└── 002-database/
    └── prompts/
        ├── 0001-design-schema.architect.prompt.md
        └── 0002-optimize-queries.refactor.prompt.md
```
Key Features
- Local sequence numbering: Each directory starts at 0001
- Stage-based extensions: Files show their type (`.architect.prompt.md`, `.red.prompt.md`)
- Auto-detection: The script finds the right feature from the branch name or the latest numbered feature
- Clear location rules:
  - `constitution`, `spec` → always `docs/prompts/`
  - Feature stages → `specs/<feature>/prompts/`
  - `general` → feature context if available, else `docs/prompts/` with a warning
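If you want to see how the local numbering plays out on disk, a quick shell check like the following (a sketch; point it at your own feature directory) shows the highest PHR sequence number so far:

```bash
# Sketch: find the highest PHR sequence number in a feature's prompts directory
# (assumes the NNNN-title.stage.prompt.md naming described above)
ls specs/001-authentication/prompts/ 2>/dev/null | grep -Eo '^[0-9]{4}' | sort -n | tail -1
```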
PHR Stages
Pre-Feature Stages
| Stage | Extension | When to Use | Example |
|---|---|---|---|
| `constitution` | `.constitution.prompt.md` | Defining quality standards, project principles | Creating constitution.md |
| `spec` | `.spec.prompt.md` | Creating business requirements, feature specs | Writing spec.md |
Feature-Specific Stages (TDD Cycle)
| Stage | Extension | TDD Phase | When to Use | Example |
|---|---|---|---|---|
| `architect` | `.architect.prompt.md` | Plan | Design, planning, API contracts | Designing JWT auth system |
| `red` | `.red.prompt.md` | Red | Debugging, fixing errors, test failures | Fixing token expiration bug |
| `green` | `.green.prompt.md` | Green | Implementation, new features, passing tests | Implementing login endpoint |
| `refactor` | `.refactor.prompt.md` | Refactor | Code cleanup, optimization | Extracting auth middleware |
| `explainer` | `.explainer.prompt.md` | Understand | Code explanations, documentation | Understanding JWT flow |
| `misc` | `.misc.prompt.md` | Other | Uncategorized feature work | General feature questions |
| `general` | `.general.prompt.md` | Any | General work within feature context | Setup, docs, general tasks |
Note: `general` stage behavior:
- If `specs/` exists: goes to `specs/<feature>/prompts/`
- If no `specs/`: falls back to `docs/prompts/` with a warning
What Happens Behind the Scenes
Automatic PHR Creation Flow
When you run any significant command:
- You execute work - `/constitution`, `/specify`, `/plan`, debugging, etc.
- AI completes the task - Creates files, writes code, fixes bugs
- Stage auto-detected - The system determines: architect, green, red, refactor, etc.
- PHR auto-created - A file is generated with the proper naming and location
- Brief confirmation - You see: `📝 PHR-0003 recorded`
All metadata captured automatically:
- Full user prompt (complete multiline text)
- Response summary
- Files modified
- Tests run
- Stage and feature context
- Timestamps and user info
Integrated SDD Workflow
```
/constitution → PHR created (docs/prompts/)
      ↓
/specify → PHR created (docs/prompts/)
      ↓
/plan → PHR created (specs/<feature>/prompts/) + ADR suggestion
      ↓
/tasks → PHR created (specs/<feature>/prompts/)
      ↓
/implement → PHR created (specs/<feature>/prompts/)
      ↓
Debug/fix → PHR created (specs/<feature>/prompts/)
      ↓
Refactor → PHR created (specs/<feature>/prompts/)
```
PHRs compound throughout the entire workflow—automatically.
Manual Override (Optional)
You can still use /phr explicitly for:
- Custom metadata and labels
- Specific stage override
- Detailed context and links
- Work that wasn't automatically captured
```
/phr Define API versioning standards   # Explicit creation with full control
```
But 95% of the time, you won't need to—PHRs just happen!
Daily Workflow with Automatic PHRs
Morning: Context Loading (2 minutes)
```bash
# Read yesterday's PHRs to rehydrate context
ls specs/*/prompts/*.prompt.md | tail -5 | xargs cat

# Or for pre-feature work
ls docs/prompts/*.prompt.md | tail -5 | xargs cat
```
During Work: Just Work (PHRs Happen Automatically)
```bash
# You work normally:
/plan Design JWT authentication system
# → AI creates plan.md
# → PHR automatically created: specs/001-auth/prompts/0001-design-jwt-system.architect.prompt.md
# → You see: 📝 PHR-0001 recorded

/implement Create token generation function
# → AI implements the code
# → PHR automatically created: specs/001-auth/prompts/0002-implement-token-gen.green.prompt.md
# → You see: 📝 PHR-0002 recorded

# Debug something:
Fix token expiration bug
# → AI fixes the bug
# → PHR automatically created: specs/001-auth/prompts/0003-fix-expiration-bug.red.prompt.md
# → You see: 📝 PHR-0003 recorded
```
No `/phr` commands needed! Every significant interaction is captured automatically.
Evening: Reflect & Learn (3 minutes)
```bash
# Review today's PHRs
grep -r "Reflection:" specs/*/prompts/ | tail -3

# Find patterns in successful prompts
grep -r "✅ Impact:" specs/*/prompts/ | grep -v "recorded for traceability"

# Count today's PHRs
find specs -name "*.prompt.md" -mtime -1 | wc -l
```
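For the weekly spaced-repetition pass mentioned above, a slightly wider window works too; this is just a sketch built from the same tools:

```bash
# Sketch: weekly review - PHRs created in the last 7 days, both locations
find docs/prompts specs -name "*.prompt.md" -mtime -7 2>/dev/null | sort
```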
What Each PHR Contains
```markdown
---
id: 0001
title: Design JWT authentication system
stage: architect
date: 2025-10-01
surface: agent
model: gpt-4
feature: 001-authentication
branch: feat/001-authentication
user: Jane Developer
command: phr
labels: ["auth", "security", "jwt"]
links:
  spec: specs/001-authentication/spec.md
  ticket: null
  adr: docs/adr/0003-jwt-choice.md
  pr: null
files:
  - src/auth/jwt.py
  - src/auth/middleware.py
  - tests/test_jwt.py
tests:
  - tests/test_jwt.py::test_token_generation
---

## Prompt

Design a JWT authentication system with token generation, validation, and refresh capabilities.

## Response snapshot

Created JWT auth system with:
- Token generation with 15-minute expiration
- Refresh token with 7-day expiration
- Middleware for route protection
- Comprehensive test coverage

## Outcome

- ✅ Impact: Complete JWT auth system designed and implemented
- 🧪 Tests: tests/test_jwt.py::test_token_generation (passing)
- 📁 Files: src/auth/jwt.py, src/auth/middleware.py, tests/test_jwt.py
- 🔁 Next prompts: Implement refresh token rotation, add rate limiting
- 🧠 Reflection: JWT implementation was straightforward; consider adding refresh token rotation for better security
```
Searching Your PHR Knowledge Base
Find by Topic
```bash
# Find all authentication-related prompts
grep -r "auth" specs/*/prompts/

# Find all prompts about databases
grep -r "database\|sql\|postgres" specs/*/prompts/
```
Find by Stage
```bash
# Find all debugging sessions (red stage)
find specs -name "*.red.prompt.md"

# Find all architecture planning (architect stage)
find specs -name "*.architect.prompt.md"

# Find all implementations (green stage)
find specs -name "*.green.prompt.md"
```
Find by File
```bash
# Find prompts that touched specific files
grep -r "auth.py" specs/*/prompts/

# Find prompts that ran specific tests
grep -r "test_login" specs/*/prompts/
```
Find by Feature
```bash
# List all PHRs for a specific feature
ls -la specs/001-authentication/prompts/

# Count PHRs per feature
for dir in specs/*/prompts; do echo "$dir: $(ls "$dir" | wc -l)"; done
```
Advanced Usage
Team Knowledge Sharing
```bash
# Commit PHRs with your code
git add specs/*/prompts/ && git commit -m "Add PHR: JWT authentication implementation"

# Review team's PHRs
git log --all --grep="PHR:" --oneline

# Create team prompt library
mkdir -p .docs/team-prompts
cp specs/*/prompts/*.architect.prompt.md .docs/team-prompts/
```
Compliance & Auditing
```bash
# Generate audit trail for security work
find specs -name "*.prompt.md" -exec grep -l "security\|auth\|payment" {} \;

# Track when decisions were made
grep -r "date:" specs/*/prompts/ | grep "2025-10"

# Find who worked on what
grep -r "user:" specs/*/prompts/ | sort | uniq
```
Performance Optimization
```bash
# Find your most effective prompts
grep -r "✅ Impact:" specs/*/prompts/ | grep -v "recorded for traceability"

# Identify patterns in failed attempts
grep -r "❌" specs/*/prompts/

# Track time-to-solution
grep -r "Next prompts:" specs/*/prompts/ | grep -v "none"
```
Integration with SDD Components
PHRs Link to Everything
```yaml
links:
  spec: specs/001-auth/spec.md              # Feature spec
  adr: docs/adr/0003-jwt-choice.md          # Architectural decision
  ticket: JIRA-123                          # Issue tracker
  pr: https://github.com/org/repo/pull/45   # Pull request
```
Workflow Integration
1. `/constitution` → docs/prompts/0001-quality-standards.constitution.prompt.md
2. `/specify` → docs/prompts/0002-auth-requirements.spec.prompt.md
3. `/plan` → specs/001-auth/prompts/0001-design-system.architect.prompt.md
4. `/adr` → (ADR references the PHR for context)
5. `/tasks` → specs/001-auth/prompts/0002-break-down-tasks.architect.prompt.md
6. `/implement` → specs/001-auth/prompts/0003-implement-jwt.green.prompt.md
7. Debug & fix → specs/001-auth/prompts/0004-fix-token-bug.red.prompt.md
8. Refactor → specs/001-auth/prompts/0005-extract-middleware.refactor.prompt.md
Why This Works (Learning Science)
Spaced Repetition
- Weekly PHR reviews reinforce successful prompting patterns
- Searching past PHRs when facing similar problems builds retrieval strength
- Pattern recognition emerges from reviewing your own prompt history
Metacognition
- Reflection prompts in each PHR force you to think about what worked
- "Next prompts" section helps you plan follow-up actions
- Outcome tracking shows the connection between prompts and results
Interleaving
- Stage tagging (architect/red/green) mixes different types of thinking
- Context switching between planning, coding, and debugging strengthens transfer
- Cross-domain learning happens when you apply patterns from one area to another
Retrieval Practice
- Searching PHRs forces active recall of past solutions
- Weekly reviews strengthen memory consolidation
- Reapplying patterns to new problems deepens understanding
Success Metrics
After 1 week of using PHRs, you should have:
- [ ] 20+ PHRs capturing your AI interactions
- [ ] 3+ successful prompt patterns you can reuse
- [ ] 1+ debugging session where PHRs saved you time
- [ ] Clear understanding of what prompts work for your domain
After 1 month:
- [ ] 100+ PHRs organized by feature
- [ ] Searchable knowledge base of effective prompts
- [ ] Measurable reduction in time spent solving similar problems
- [ ] Team members using each other's PHRs as templates
The goal: Turn AI assistance from ad-hoc to systematic, building a compounding knowledge base that makes you more effective every day.
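A quick way to check yourself against these metrics (a sketch; the paths simply follow the location rules described above):

```bash
# Sketch: progress check against the success metrics
echo "Total PHRs:       $(find docs/prompts specs -name '*.prompt.md' 2>/dev/null | wc -l)"
echo "Debugging (red):  $(find specs -name '*.red.prompt.md' 2>/dev/null | wc -l)"
echo "Planning (arch):  $(find specs -name '*.architect.prompt.md' 2>/dev/null | wc -l)"
```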
Troubleshooting
Common Issues
"Feature stage 'architect' requires specs/ directory and feature context"
- Cause: Using a feature stage (`architect`, `red`, `green`, etc.) before the `specs/` directory exists
- Solution: Use the pre-feature stages (`constitution`, `spec`) or create a feature first with `/specify`
"No feature specified and no numbered features found"
- Cause: Working in feature context but no features exist
- Solution: Run `/specify` to create your first feature, or specify `--feature` manually
"Feature directory not found"
- Cause: The specified feature doesn't exist in `specs/`
- Solution: Check available features with `ls specs/` or create the feature with `/specify`
"Warning: No specs/ directory found. Using docs/prompts/ for general stage."
- Cause: Using the `general` stage when no `specs/` directory exists
- Not an error: The PHR will be created in `docs/prompts/` as a fallback
- Suggestion: Consider using the `constitution` or `spec` stages for pre-feature work, or create a feature first
Manual PHR Creation
If needed, you can create PHRs manually:
```bash
# Pre-feature PHR (constitution or spec)
scripts/bash/create-phr.sh \
  --title "Define API standards" \
  --stage constitution \
  --json

# Feature-specific PHR (requires specs/ and feature context)
scripts/bash/create-phr.sh \
  --title "Implement login" \
  --stage green \
  --feature "001-auth" \
  --json

# General stage (falls back to docs/prompts/ if no specs/)
scripts/bash/create-phr.sh \
  --title "Setup CI pipeline" \
  --stage general \
  --json
```
Note: The script only creates the file with placeholders. The AI agent must fill all `{{PLACEHOLDERS}}` after creation.
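If you create PHRs this way, it is worth confirming nothing was left unfilled; a quick sketch:

```bash
# Sketch: list PHR files that still contain unfilled {{PLACEHOLDERS}}
grep -rl "{{" docs/prompts specs --include="*.prompt.md" 2>/dev/null
```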
Comparison: PHR vs Traditional Methods
| Aspect | Traditional (Chat History) | PHR System |
|---|---|---|
| Persistence | Lost when chat closes | Permanent, version-controlled |
| Searchability | Limited to current session | grep, find, full-text search |
| Organization | Chronological only | By feature, stage, file, label |
| Team Sharing | Screenshots, copy-paste | Git commits, pull requests |
| Traceability | None | Links to specs, ADRs, PRs |
| Learning | No reinforcement | Spaced repetition, retrieval practice |
| Compliance | No audit trail | Complete history with metadata |
Summary: PHRs are Automatic
PHRs are built into Spec Kit with automatic creation:
✅ Completely automatic: Created after every significant command—no extra work
✅ Deterministic location:
- Pre-feature (`constitution`, `spec`) → `docs/prompts/`
- Feature work → `specs/<feature>/prompts/`
- Clear file naming with stage extensions

✅ Full metadata capture: Prompts, responses, files, tests, timestamps
✅ Stage-based organization: architect, red, green, refactor, explainer, etc.
✅ Learning-focused: Based on spaced repetition and retrieval practice
✅ Team-friendly: Version-controlled, searchable, shareable
✅ Compliance-ready: Complete audit trail with no manual effort
Start using PHRs today by running `specify init` and working normally. Every AI interaction is automatically captured, documented, and searchable. Your future self (and your team) will thank you! 🚀
Key Takeaway
You don't need to think about PHRs—they just happen.
Work normally with Spec Kit commands, and get automatic documentation of your entire AI-assisted development journey.
Step 3: Create the Project Rulebook - The Constitution!
Goal: document the non-negotiable principles that every spec, plan, and task must honor.
Purpose: What is a Constitution?
Imagine you and your computer helper are a team building a giant LEGO castle. Before you start, you need to agree on some rules so you don't mess things up.
- What if you want all the towers to be square, but your helper starts building round ones?
- What if you decide the roof must be blue, but your helper builds a red one?
That would be a mess!
The Constitution is your team's Rulebook. It lists the most important rules that both you and your computer helper MUST follow, no matter what. It makes sure you both build the project the exact same way, every single time.
Best Practices: What Rules Go in the Rulebook?
Your rulebook shouldn't be a thousand pages long. It should only have the most important, "non-negotiable" rules. Think about these questions:
How will we know it works?
- Good Rule: "Every part we build must have a special check (a test) to make sure it's not broken."
- Bad Rule: "Try to make it good." (This is too vague!)
What should it look like?
- Good Rule: "We will always use bright, happy colors and the 'Comic Sans' font."
- Good Rule: "We will build it with these special 'NextJS' LEGO bricks."
How will we keep it safe?
- Good Rule: "We will never write down secret passwords inside our project."
How will we work together?
- Good Rule: "We will only build in small pieces at a time, not one giant chunk."
You write these rules down so your computer helper can read them and never forget them. It helps the AI build exactly what you want, in the style you want, and keeps your project strong and safe.
Your Hands-On Plan
Set up a new project with `sp init hello_spp` and follow along:
Ask your helper to write the first draft.
- In your agent chat, running `/sp.constitution` is like asking your helper, "Can you start a new rulebook for us?"

Tell your helper what kind of project you're making.
- When you run the prompt `/sp.constitution Fill the constitution with the bare minimum requirements for a static web app based on the template. We use NextJS 15...`, you're telling the AI, "Okay, the game we're playing is 'Website Building'. Let's write down the basic rules for that game, using these specific LEGO pieces."

Be the Boss: Check the Rules.
- Your computer helper is smart, but you're the boss. Open the `constitution.md` file and read the rules it wrote. Do they make sense? Is anything missing? You can change, add, or remove any rule you want.

Save Your Rulebook.
- "Committing the first version as v1.0" is like taking a picture of your finished rulebook and labeling it "Version 1." This way, your whole team knows which rules to follow, and you can always look back to see how the rules have changed over time.
Inputs
- The generated `.specify/memory/constitution.md`
- Stakeholder alignment on mandatory practices
Actions
- In your agent chat, run `/sp.constitution` to generate or update the baseline document.
- Update the constitution.md file using your AI agent. Prompt: `/sp.constitution Fill the constitution with the bare minimum requirements for a static web app based on the template. We use NextJS 15, React 19, TailwindCSS and ShadCN components.`
- Open `.specify/memory/constitution.md` and review the generated rules.
- Commit the first version as `v1.0`.
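A minimal sketch of that last step, assuming the default `.specify/memory/` location (the tag name is illustrative, not prescribed by Spec Kit):

```bash
# Sketch: commit and tag the first version of the rulebook
git add .specify/memory/constitution.md
git commit -m "docs: add project constitution v1.0"
git tag constitution-v1.0   # illustrative tag name; use your team's convention
```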
Deliverables
- Canonical constitution stored in Git and referenced by every downstream artifact
Common Pitfalls
- Writing vague aspirations (“write clean code”) instead of enforceable rules
- Allowing the constitution to drift from reality—review it alongside major releases
- Leaving the file outside version control (loses traceability)
References
- Spec Kit Plus repo: https://github.com/panaversity/spec-kit-plus
- PyPI package: https://pypi.org/project/specifyplus/
- Original GitHub Spec Kit repo: https://github.com/github/spec-kit
- Microsoft Dev Blog (Spec Kit intro): https://developer.microsoft.com/blog/spec-driven-development-spec-kit
- The ONLY guide you'll need for GitHub Spec Kit: https://www.youtube.com/watch?v=a9eR1xsfvHg
Step 4: Specify the Feature (The "What" and "Why")
Goal: Translate a high-level user need into a detailed, unambiguous, and reviewable specification. This artifact becomes the shared source of truth for the feature before any technical planning begins.
Inputs
- The approved `constitution.md` file.
- A clear idea of the user problem you are solving (your "product intent brief").
- The `/specify` slash command, which is now available in your agent's chat.
The /specify Command
This command transforms a simple feature description (the user-prompt) into a complete, structured specification with automatic repository management:
- Automatic Feature Numbering: Scans existing specs to determine the next feature number (e.g., 001, 002, 003)
- Branch Creation: Generates a semantic branch name from your description and creates it automatically
- Template-Based Generation: Copies and customizes the feature specification template with your requirements
- Directory Structure: Creates the proper specs/[branch-name]/ structure for all related documents
Actions
Set up a new project with `sp init hello_spp`, create the constitution, and follow along:
Craft and Run the Specify Prompt: In your agent chat, run the `/specify` command. Provide a clear, user-focused description of the feature. Crucially, include a reference to your `constitution.md` file to ensure the AI's output adheres to your project's rules.
- Your Prompt Example (Perfect): `/specify @constitution.md I am building a modern podcast website. I want it to look sleek, something that would stand out. Should have a landing page with one featured episode, an about page, and a FAQ page. Should have 20 episodes, and the data is mocked - you do not need to pull anything from any real feed.`
Observe the Agent's Automated Scaffolding: The AI agent will now execute its `specify` script. It performs several actions automatically:
- It creates a new, isolated Git branch for the feature (e.g., `001-i-am-building`).
- It generates a new folder inside `specs/` for all of this feature's artifacts.
- It creates a `spec.md` file inside that folder, populating it based on your prompt and a template.
- It performs an initial validation of the spec against your constitution.
- You must verify and approve the generated `spec.md` file. You can iterate further manually or with your AI companion. Where something needs clarification, use the best guess you think is reasonable and update the acceptance checklist afterward.
Human Review & Clarification Loop (The Most Important Part): The agent has provided a first draft. Now, your role as the developer is to refine it into a final, complete specification.
- Open the newly generated `spec.md` file.
- Resolve Ambiguities: Search the document for any `[NEEDS CLARIFICATION]` markers. For each one, have a conversation with your agent to resolve it.
  - Example Prompt: `In @spec.md, it asks about episode ordering. Let's make it reverse chronological (newest first). Please update the document and remove the clarification marker.`
- Tighten Scope: Review the generated user stories and functional requirements. Are they accurate? Add explicit non-goals to prevent scope creep.
  - Example Prompt: `In @spec.md, under non-goals, please add that this feature will not include user comments or real-time playback analytics.`
- Ensure Testability: Read the Acceptance Scenarios. Are they clear, measurable, and written in a way that can be easily turned into automated tests (like Given/When/Then)? See the sketch after this list for one way to add a missing scenario by hand.
  - Example Prompt: `The acceptance scenario for the landing page is good, but please add a new scenario: "Given I am on the landing page, When I click the 'View All Episodes' button, Then I am taken to the '/episodes' page."`
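If you prefer to add a straightforward scenario yourself rather than prompting the agent, here is a minimal sketch (the feature path and wording are illustrative):

```bash
# Sketch: append a Given/When/Then acceptance scenario to the spec
cat >> specs/001-i-am-building/spec.md <<'EOF'

### Acceptance Scenario: View all episodes
- Given I am on the landing page
- When I click the "View All Episodes" button
- Then I am taken to the "/episodes" page
EOF
```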
Version and Commit the Spec: Once the `spec.md` is clear, complete, and agreed upon, update its status.
- Manually edit the `spec.md` header from `Status: Draft` to `Status: Ready for Planning`.
- Commit the finalized `spec.md` to its feature branch.
Deliverables
- A new Git branch dedicated to the feature.
- A final, reviewed, and committed `spec.md` file that serves as the unambiguous source of truth for what you are building.
Next step: run the /plan command when ready.
Quality Gates ✅
- ✅ All `[NEEDS CLARIFICATION]` markers have been resolved.
- ✅ The spec includes clear, testable Acceptance Scenarios for all primary user flows.
- ✅ The spec aligns with the project rules defined in `constitution.md`.
- ✅ (For teams) The spec has been reviewed and approved by relevant stakeholders (e.g., in a Pull Request).
Common Pitfalls
- Moving to the `/plan` step too early, before all ambiguities in the spec are resolved. This is the #1 mistake to avoid.
- Writing a `/specify` prompt that describes how to build it, not what to build (e.g., "create a React component" vs. "show the user a list of episodes").
- Forgetting to `@mention` the constitution in your prompt, which can lead to the AI ignoring your core rules.
References
- Spec Kit Plus repo: https://github.com/panaversity/spec-kit-plus
- PyPI package: https://pypi.org/project/specifyplus/
- Original Spec Kit repo: https://github.com/github/spec-kit
Step 5: Define the Technical Plan (The "How")
Goal: Translate the "what" and "why" from the approved spec into a concrete technical strategy. This phase generates the high-level engineering blueprint, including architecture, data structures, and setup instructions, all while respecting the Constitution.
Inputs
- The approved and clarified `spec.md` from the previous step.
- Your `constitution.md` file.
- The `/plan` command available in your agent chat.
The /plan Command
Once a feature specification exists, this command creates a comprehensive implementation plan:
- Specification Analysis: Reads and understands the feature requirements, user stories, and acceptance criteria
- Constitutional Compliance: Ensures alignment with project constitution and architectural principles
- Technical Translation: Converts business requirements into technical architecture and implementation details
- Detailed Documentation: Generates supporting documents for data models, API contracts, and test scenarios
- Quickstart Validation: Produces a quickstart guide capturing key validation scenarios
Actions
Craft and Run the Plan Prompt: In your agent chat, run the `/plan` command. Your prompt should clearly state the high-level technical choices for the feature. Crucially, you must `@mention` the `spec.md` file so the agent uses it as the source of truth.
- Your Prompt Example (Perfect): `/plan I am going to use Next.js with static site configuration, no databases - data is embedded in the content for the mock episodes. Site is responsive and ready for mobile. @specs/001-i-am-building/spec.md`
Observe the Agent's Automated Artifact Generation: The `/plan` command is more than a single action. As your output shows, the agent will now execute a script to scaffold a comprehensive technical foundation for the feature:
- It creates the main `plan.md` file, which serves as the central hub for the implementation strategy.
- It also generates several supporting artifacts inside the feature's `specs/` directory:
  - `research.md`: Documents initial technical decisions and alternatives considered.
  - `data-model.md`: Outlines the shape of the data and entities (e.g., the fields for a `PodcastEpisode`).
  - `contracts/`: A folder that will contain formal data schemas (like JSON Schema) to validate data; it may or may not be generated, depending on the feature. A sketch of such a contract follows this list.
  - `quickstart.md`: Provides simple, clear instructions for another developer to set up and run this feature locally.
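For a feel of what lands in `contracts/`, here is a minimal JSON Schema sketch for the mocked episode data; the file name and fields are illustrative, not what the agent will necessarily generate:

```bash
# Sketch: write an illustrative JSON Schema contract for a mocked episode
cat > specs/001-i-am-building/contracts/episode.schema.json <<'EOF'
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "PodcastEpisode",
  "type": "object",
  "required": ["id", "title", "duration"],
  "properties": {
    "id": { "type": "string" },
    "title": { "type": "string" },
    "duration": { "type": "string", "pattern": "^[0-9]{2}:[0-9]{2}$" }
  }
}
EOF
```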
Human Review of the Entire Plan: Your job is now to review this collection of generated documents.
- Start with `plan.md`: Read through it. It will have an Execution Flow, a Summary, and a "Constitution Check" section where the agent confirms its plan aligns with your rules.
- Review Supporting Docs:
  - Look at `data-model.md`. Does the data structure make sense? Does it include all the necessary fields?
  - Look at `quickstart.md`. Are the setup steps clear and correct?
- Iterate with the Agent: If any part of the plan is incorrect or incomplete, have a conversation with the agent to refine it.
  - Example Prompt: `In @data-model.md, the Episode entity is missing a 'duration' field. Please add it as a string formatted like "MM:SS" and update the document.`
Version and Commit the Plan: Once you are satisfied that the plan is complete, actionable, and respects the constitution, mark it as ready.
- Manually edit `plan.md`'s header to update its status.
- Commit all the newly created and modified plan artifacts to the feature branch.
Deliverables
- A committed `plan.md` file that serves as the technical blueprint.
- A set of supporting artifacts (`research.md`, `data-model.md`, `contracts/`, `quickstart.md`) that provide deep technical context for the feature.
Quality Gates ✅
- ✅ The technical plan directly addresses every functional requirement listed in `spec.md`.
- ✅ The agent's "Constitution Check" within `plan.md` has passed, and you have manually verified its claims.
- ✅ The generated data models and contracts are accurate and complete.
- ✅ (For teams) The entire set of plan artifacts has been reviewed and approved in a Pull Request.
Common Pitfalls
- Allowing the plan to restate the spec instead of providing a clear technical path forward.
- Introducing new scope or features that were not defined in the original, approved spec.
- Forgetting about operational concerns like testing strategy, deployment, and monitoring, which should be part of the technical plan.
- Not reviewing the supporting artifacts (`data-model.md`, `quickstart.md`, etc.), as they are just as important as the main `plan.md`.
References
- Spec Kit Plus repo: https://github.com/panaversity/spec-kit-plus
- PyPI package: https://pypi.org/project/specifyplus/
- GitHub blog overview: https://github.blog/ai-and-ml/generative-ai/spec-driven-development-with-ai-get-started-with-a-new-open-source-toolkit/
Step 6: Architecture Decision Records (ADR) - Post-Planning Review
After `/plan` completes, you'll see a suggestion to review for architectural decisions. Run `/adr` to capture significant technical choices before implementation.
The Problem: Undocumented Decisions
Teams make critical architectural decisions during planning—database choices, API patterns, security models—but these decisions often live only in chat history or planning docs. When revisited months later:
- Why questions have no documented answers
- Tradeoffs are forgotten
- Alternatives that were considered are lost
- Context that influenced the decision is missing
The Solution: Explicit ADR Review After Planning
ADRs (Architecture Decision Records) capture why technical decisions were made, not just what was decided. In Spec Kit, ADRs are created after planning when you have full context.
When ADRs Happen
```
/constitution → /specify → /plan
      ↓
📋 Suggestion appears: "Review for architectural decisions? Run /adr"
      ↓
(You run /adr)
      ↓
ADRs created in docs/adr/
      ↓
/tasks → /implement
```
Key flow:
- Complete planning with `/plan`
- PHR automatically created (planning session documented)
- ADR suggestion appears (automatic reminder)
- You explicitly run `/adr` when ready
- ADRs created for significant decisions
Why Explicit, Not Automatic?
ADR creation requires careful analysis and judgment. You might need to:
- Discuss decisions with the team first
- Review existing ADRs before creating new ones
- Decide if decisions are truly architecturally significant
The suggestion ensures you don't forget, but you control when it happens.
What Makes a Decision "Architecturally Significant"?
The /adr command uses three criteria (ALL must be true):
1. Impacts How Software is Structured
Does this decision change how engineers write, organize, or architect code?
✅ Yes: Choosing REST vs GraphQL API
❌ No: Which HTTP client library to use
2. Has Notable Tradeoffs or Alternatives
Were alternatives considered? Are there consequences to understand?
✅ Yes: PostgreSQL vs MongoDB (different data models, query patterns)
❌ No: Using prettier for formatting (no architectural tradeoff)
3. Will Be Questioned or Revisited Later
Will someone ask "why did we do this?" in 6 months?
✅ Yes: Microservices vs monolith architecture
❌ No: Naming a helper function
Examples
Architecturally Significant (Create ADR):
- Database choice: PostgreSQL vs MongoDB
- Auth strategy: JWT vs sessions
- API pattern: REST vs GraphQL vs RPC
- Deployment: Serverless vs containers
- State management: Redux vs Context API
- Testing strategy: Unit vs integration focus
NOT Architecturally Significant (Skip ADR):
- Variable naming conventions
- Code formatting rules
- Which linter to use
- Specific library versions
- File organization preferences
- Comment style guidelines
ADR Granularity: Clusters vs Atomic Decisions
✅ CORRECT: Document Decision Clusters
Group related technologies that work together as an integrated solution:
Good Example - Frontend Stack:
```
ADR-0001: Frontend Technology Stack
- Framework: Next.js 14 (App Router)
- Styling: Tailwind CSS v3
- Deployment: Vercel
- State: React Context

Alternatives: Remix + styled-components + Cloudflare
```
Why this works:
- These technologies are chosen together for integration benefits
- They would likely change together if requirements shift
- ONE decision: "Modern React stack optimized for Vercel"
❌ INCORRECT: Atomic Technology Choices
Don't create separate ADRs for each technology in an integrated solution:
Bad Example:
```
ADR-0001: Use Next.js Framework
ADR-0002: Use Tailwind CSS
ADR-0003: Deploy on Vercel
ADR-0004: Use React Context
```
Problems:
- Over-documentation (4 ADRs instead of 1)
- Loses integration story (why these work together)
- Makes decisions seem independent when they're not
Clustering Rules
Cluster together when:
- Technologies are chosen for integration benefits
- They would change together (coupled lifecycle)
- One decision explains why all components fit
Separate ADRs when:
- Decisions are independent (frontend vs backend stacks)
- Could evolve separately (API protocol vs database choice)
- Different teams own different parts
Real-World Examples
| Scenario | Number of ADRs | Titles |
|---|---|---|
| Frontend + Backend | 2 ADRs | "Frontend Stack", "Backend Stack" |
| Auth approach | 1 ADR | "Authentication Architecture" (JWT + Auth0 + session strategy) |
| Data layer | 1 ADR | "Data Architecture" (PostgreSQL + Redis + migration tools) |
| Deployment | 1 ADR | "Deployment Platform" (Vercel + GitHub Actions + monitoring) |
| Microservices split | 1 ADR per boundary | "User Service Boundary", "Payment Service Boundary" |
Industry Standards
This follows Michael Nygard's ADR pattern (2011) and ThoughtWorks' recommendation:
- ADRs document architectural decisions, not technology inventories
- Focus on why, not just what
- Cluster related choices that share context and tradeoffs
How It Works
Step 1: Complete Planning
/plan # Design your feature architecture
You create:
- `specs/001-auth/plan.md` - Main planning document
- `specs/001-auth/research.md` - Research notes (optional)
- `specs/001-auth/data-model.md` - Data models (optional)
- `specs/001-auth/contracts/` - API contracts (optional)
Step 2: See ADR Suggestion
After /plan completes, you see:
📋 Planning complete! Review for architectural decisions? Run /adr
This happens automatically—no action needed yet.
Step 3: Run /adr When Ready
When you're ready to review (immediately, or after team discussion):
/adr # Analyzes planning artifacts
Step 4: ADR Analysis Workflow
The /adr command:
Loads planning context
- Reads plan.md, research.md, data-model.md, contracts/
- Understands the feature requirements
Extracts decisions
- Identifies technical choices made during planning
- Examples: "Using PostgreSQL", "JWT authentication", "REST API"
Checks existing ADRs
- Reads docs/adr/ to find related decisions
- Avoids duplicates or superseded decisions
Applies significance test
- For each decision, checks all 3 criteria
- Only proceeds if decision is architecturally significant
Creates ADR files
- Generates docs/adr/NNNN-decision-title.md
- Sequential numbering (0001, 0002, 0003, etc.)
- Complete ADR template with context, decision, consequences
Shows report
- Lists created ADRs
- Lists referenced ADRs
- Confirms readiness for /tasks
Step 5: Review and Proceed
You now have:
- ✅ Planning docs (plan.md, research.md, etc.)
- ✅ PHR of planning session (automatic)
- ✅ ADRs for architectural decisions (explicit via /adr)
Ready for /tasks to break down implementation!
What Each ADR Contains
ADRs are created in docs/adr/ with sequential numbering:
```
docs/
└── adr/
    ├── 0001-use-postgresql-database.md
    ├── 0002-jwt-authentication-strategy.md
    └── 0003-rest-api-architecture.md
```
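If you ever add an ADR by hand, a quick way to find the next sequential number (a sketch that assumes the `NNNN-title.md` naming shown above):

```bash
# Sketch: compute the next ADR number from the files already in docs/adr/
last=$(ls docs/adr/ 2>/dev/null | grep -Eo '^[0-9]{4}' | sort -n | tail -1)
printf "Next ADR number: %04d\n" $(( 10#${last:-0} + 1 ))
```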
Each ADR file contains:
```markdown
# ADR-0002: JWT Authentication Strategy

## Status
Accepted

## Context
We need secure, stateless authentication for our API...

## Decision
We will use JWT (JSON Web Tokens) for authentication...

## Consequences

### Positive
- Stateless authentication (no server-side sessions)
- Works well with microservices
- Industry standard with good library support

### Negative
- Token revocation is complex
- Must manage token refresh carefully
- Token size larger than session IDs

## Alternatives Considered
- Session-based auth (rejected: requires sticky sessions)
- OAuth2 (rejected: overkill for our use case)

## Related
- Spec: specs/001-auth/spec.md
- Plan: specs/001-auth/plan.md
```
Integrated Workflow
Full SDD Flow with ADRs
```
1. /sp.constitution
   └─→ PHR created (automatic)

2. /sp.specify
   └─→ PHR created (automatic)

3. /sp.plan
   └─→ PHR created (automatic)
   └─→ 📋 "Review for architectural decisions? Run /adr"

4. /sp.adr   ← YOU RUN THIS
   └─→ ADRs created in docs/adr/
   └─→ Shows report

5. /sp.tasks
   └─→ PHR created (automatic)

6. /sp.implement
   └─→ PHR created (automatic)
```
ADRs Link Planning to Implementation
- Spec (specs/001-auth/spec.md) - WHAT we're building
- Plan (specs/001-auth/plan.md) - HOW we'll build it
- ADR (docs/adr/0002-jwt-auth.md) - WHY we made key decisions
- Tasks (specs/001-auth/tasks.md) - Work breakdown
- Implementation - Actual code
ADRs provide the critical WHY context that specs and plans don't fully capture.
Common Scenarios
Scenario 1: Simple Feature (No ADRs Needed)
```
/plan   # Design simple CRUD endpoint
# → 📋 Suggestion: "Run /adr"

/adr    # Analyzes planning artifacts
# → "No architecturally significant decisions found. Proceed to /tasks."
```
Result: No ADRs created (implementation details don't need ADRs)
Scenario 2: Complex Feature (Multiple ADRs)
```
/plan   # Design new microservice with database and API
# → 📋 Suggestion: "Run /adr"

/adr    # Analyzes planning artifacts
# → Creates:
#    - docs/adr/0005-use-event-sourcing.md
#    - docs/adr/0006-kafka-message-broker.md
#    - docs/adr/0007-graphql-api.md
# → "3 ADRs created. Proceed to /tasks."
```
Result: Multiple ADRs for significant architectural choices
Scenario 3: References Existing ADR
```
/plan   # Design feature using existing patterns
# → 📋 Suggestion: "Run /adr"

/adr    # Analyzes planning artifacts
# → "Referenced existing ADRs:
#    - docs/adr/0002-jwt-authentication.md
#    - docs/adr/0003-rest-api-pattern.md
#    No new ADRs needed. Proceed to /tasks."
```
Result: No new ADRs (reusing established patterns)
Troubleshooting
"No plan.md found"
Cause: Running `/adr` before `/plan`
Solution: Complete planning first with `/plan Design the feature`
"No architecturally significant decisions found"
Cause: Planning contains only implementation details
Solution: Normal! Not every feature needs ADRs. Proceed to `/tasks`.
Too Many ADRs Created
Cause: Including implementation details in planning
Solution: Focus planning on architecture, not code-level decisions.
ADR Duplicates Existing ADR
Cause: Didn't reference existing patterns
Solution: Review `docs/adr/` before planning new features.
Best Practices
Do Create ADRs For:
✅ Technology choices (databases, frameworks, platforms)
✅ Architectural patterns (microservices, event-driven, layered)
✅ Security models (auth strategies, encryption approaches)
✅ API contracts (REST vs GraphQL, versioning strategies)
✅ Data models (normalization, schema design)
✅ Infrastructure decisions (deployment patterns, scaling strategies)
Don't Create ADRs For:
❌ Code style and formatting
❌ Library choices (unless there is architectural impact)
❌ Variable/function naming
❌ Implementation algorithms
❌ Testing implementation details
❌ Documentation formats
When in Doubt:
Ask: "Will someone question this decision in 6 months?"
- Yes → Create ADR
- No → Skip ADR
Summary
ADRs capture the why behind architectural decisions:
✅ Automatic suggestion after `/plan` - Never forget to review
✅ Explicit execution via `/adr` - You control timing
✅ Significance test - Only creates ADRs for important decisions
✅ Sequential numbering - Consistent IDs (0001, 0002, 0003, etc.)
✅ Location - All ADRs in `docs/adr/` (centralized)
✅ Full context - Links to specs, plans, alternatives, consequences
✅ Team alignment - Documented decisions reduce debate
Key Workflow:
- Complete `/plan` → Suggestion appears
- Run `/adr` when ready → ADRs created (if significant)
- Proceed to `/tasks` → Implementation breakdown
Remember: ADRs are for architecture, not implementation. Focus on decisions that:
- Change how code is structured
- Have notable tradeoffs
- Will be questioned later
Start creating ADRs today to document the why behind your technical choices! 🚀
Step 7: Generate the Actionable Task List
Goal: Deconstruct the technical plan into a sequence of
small, verifiable, and testable tasks. This tasks.md file is the
final blueprint that the AI agent will follow to build the feature, turning
the strategic "how" into a tactical "how-to."
Inputs
- The approved `spec.md`.
- The approved `plan.md` and all its supporting artifacts (`data-model.md`, etc.).
- Your `constitution.md` (specifically, any rules on task sizing).
- The `/tasks` slash command in your agent chat.
The /tasks Command
After a plan is created, this command analyzes the plan and related design documents to generate an executable task list:
- Inputs: Reads plan.md (required) and, if present, data-model.md, contracts/, and research.md
- Task Derivation: Converts contracts, entities, and scenarios into specific tasks
- Parallelization: Marks independent tasks [P] and outlines safe parallel groups
- Output: Writes tasks.md in the feature directory, ready for execution by a Task agent
Actions
Generate the Initial Task Breakdown: In your agent chat, run the `/tasks` command. You can give it the context of the entire feature directory to ensure it has all the necessary information.
- Your Prompt Example (Perfect): `/tasks Follow and break this down into tasks @specs/001-i-am-building/`
Observe the Agent's Task Generation: The AI will now process the spec, plan, and constitution. It will generate a single, comprehensive `tasks.md` file inside the feature's directory. As your output shows, it will:
- Structure the Work: Break down the project into logical phases (e.g., Phase 3.1: Setup, Phase 3.2: Tests First, Phase 3.3: Core Implementation).
- Define Individual Tasks: Create specific, actionable tasks with unique IDs (e.g., `T001`, `T002`). Each task will map to a concrete action, like "Initialize Next.js app skeleton" or "Create contract validation script."
- Suggest an Execution Strategy: It may provide guidance on which tasks can be run in parallel `[P]` versus those that must be run sequentially.
Human Review and Refinement: This is your final chance to review the construction plan before the "building" starts.
- Open the newly generated `tasks.md` file.
- Check for Completeness: Read through the task list. Does it cover every functional requirement from the spec and every technical component from the plan? Did the agent forget anything (like documentation or final cleanup)?
- Validate the Order: Does the sequence of tasks make sense? For a TDD project, the "Tests First" tasks should come before the "Core Implementation" tasks.
- Check Task Size: Is any single task too large? For instance, if you see a task like `T015: Implement entire frontend`, that's a red flag. You should instruct the agent to break it down further.
  - Example Prompt: `The task T015 is too large. Break it down into separate tasks for creating the header, the footer, the hero component, and the episode list component. Update tasks.md.`
- Add Non-Code Tasks: The agent might forget process-oriented tasks. Manually add them to the list if needed (see the sketch after this list), for example: `- [ ] T0XX: Create a PR for review once all coding tasks are complete.`
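One way to append such a process task by hand (a sketch; the feature path and task ID are illustrative):

```bash
# Sketch: append a process-oriented task the agent may have missed
cat >> specs/001-i-am-building/tasks.md <<'EOF'
- [ ] T0XX: Create a PR for review once all coding tasks are complete.
EOF
```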
Commit the Final Task List: Once you are confident the `tasks.md` is complete and actionable, commit it to the feature branch. This document now becomes the locked-down "script" for the implementation phase.
Deliverables
- A final, reviewed, and committed `tasks.md` file that provides a clear, step-by-step implementation checklist.
Quality Gates ✅
- ✅ The task list completely covers all requirements from the `spec.md` and `plan.md`.
- ✅ Each task is small, well-defined, and has a clear "definition of done" (often, passing a specific test).
- ✅ The task sequence is logical and respects dependencies.
- ✅ (For teams) The `tasks.md` file has been approved by the tech lead or relevant team members.
Common Pitfalls
- Accepting the agent-generated list without review. AI agents can sometimes create vague ("polish the UI") or overlapping tasks.
- Forgetting to include tasks for crucial non-feature work, such as running tests, creating documentation (`README.md`), and setting up CI/CD.
References
- Spec Kit Plus repo: https://github.com/panaversity/spec-kit-plus
- PyPI package: https://pypi.org/project/specifyplus/
- Original Spec Kit repo: https://github.com/github/spec-kit
Step 8: Implement, Test, and Validate
Goal: Direct the AI agent to execute the entire task list, turning the specification and plan into a fully functional, tested, and verifiable piece of software. In this phase, your role shifts from writer to director and quality assurance.
Inputs
- The complete and finalized `tasks.md` checklist from the previous step.
- The full context of the feature directory (`spec.md`, `plan.md`, `constitution.md`).
- Your agent chat (Cursor, Gemini CLI, etc.) with a running terminal.
Actions
Initiate the Implementation: Give the AI agent a single, clear command to begin the work. You are handing it the entire checklist and authorizing it to proceed.
- Your Prompt Example (Perfect): `/implement Implement the tasks for this project and update the task list as you go. @tasks.md`
- Your Prompt Example (Perfect):
Monitor the Agent's Execution: Your primary role now is to observe. The AI will announce what it's doing, following the `tasks.md` file as its script. You will see a flurry of activity in the terminal and in your file tree as the agent:
- Sets up the environment (e.g., runs `npm install`, configures `tsconfig.json`).
- Writes failing tests first (adhering to the TDD principle in your constitution).
- Implements the core logic to make the tests pass.
- Creates all necessary files: components, pages, utility functions, data files, and stylesheets.
- Refines and polishes the code based on instructions in the task list.
Perform the Interactive Review Loop: The agent will not run completely on its own. It will periodically stop and present you with a set of file changes. This is your critical review checkpoint.
- For each set of changes, review the code diff.
- Ask yourself: Does this code correctly implement the specific task? Does it adhere to our Constitution? Is it clean and maintainable?
- Click "Keep" to approve the changes and let the agent continue to the next task. If something is wrong, click "Undo" and provide a corrective prompt.
Final Validation (The Human's Turn): After the agent reports that all tasks are complete (as seen in your example output), you perform the final verification. Do not blindly trust the AI's summary.
- Run the Tests Yourself: In the terminal, run the test suite to get an objective confirmation that everything works as specified: `npm test`
- Run the Application: Start the development server with `npm run dev`, then open the browser and click around. Does it look and feel like the experience you envisioned in your `spec.md`? This is the ultimate test.
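Put together, the manual validation pass is just two commands plus your own eyes (assuming the standard scripts in the generated Next.js project):

```bash
# Objective check: the test suite must pass locally
npm test

# Subjective check: click through the feature in the browser yourself
npm run dev
```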
Commit the Working Software: Once you are fully satisfied, commit all the work to the feature branch.
- Example Commit Message (from your output): `git commit -m "feat(podcast): implement modern podcast website"`
Deliverables
- A complete, working, and tested implementation of the feature on its own Git branch.
-
An updated
tasks.mdfile where all tasks are checked off, providing a clear audit trail of the work performed.
Quality Gates ✅
- ✅ All automated tests created during the process pass successfully in CI and locally.
- ✅ You have manually reviewed the agent's code at each step of the implementation loop.
- ✅ You have manually run and interacted with the final application, confirming it meets the user experience outlined in the `spec.md`.
Common Pitfalls
- "Fire and Forget": Giving the
/implementcommand and not actively monitoring and reviewing the agent's progress. This can lead to the agent going down the wrong path. - Ignoring Failing Tests: If the agent's implementation fails a test and it can't fix it, it's your job to intervene, diagnose the problem, and provide guidance.
- Skipping the Final Manual Review: Relying solely on automated tests might miss visual bugs, awkward user flows, or other user experience issues. Always look at the final product.
References
- Spec Kit Plus repo: https://github.com/panaversity/spec-kit-plus
- PyPI package: https://pypi.org/project/specifyplus/
- Original Spec Kit repo: https://github.com/github/spec-kit
Step 9: Clarify & Analyze Spec Deep Dive
Two key commands that enhance the Spec Kit workflow are /clarify and /analyze. These commands help to mitigate the risk of "underspecification," where a lack of detail can lead to incorrect assumptions and rework.
Goal: keep your spec, plan, and task list honest by driving
ambiguity to zero with /clarify and proving coverage with
/analyze before you advance to planning or implementation.
Core concepts
Clarification queue: The /clarify command helps to resolve ambiguities in the specification by engaging the user in a structured dialogue. It presents a series of up to five questions with multiple-choice answers to refine underspecified areas. The user's selections are then logged in the specification, creating a clear record of the decisions made.
Coverage check: /analyze command provides a crucial check for consistency across the various project artifacts. It examines the spec.md, plan.md, and tasks.md files to identify any inconsistencies, gaps, or violations of the project's defined "constitution" – a set of non-negotiable principles.
Inputs
- Current `spec.md`, `plan.md`, `tasks.md`
- Latest `/clarify` transcript or log
- Most recent `/analyze` report
- Stakeholder feedback or product notes that triggered rework
- Relevant constitution clauses (change management, Article III test-first, Article IX integration-first)
Actions
- Collect open questions
  - Export the `/clarify` log and tag each item by theme (scope, UX, data, compliance).
  - Assign owners and due dates so nothing stalls.
- Run `/clarify` and resolve
  - Execute `/clarify @specs/<feature>/spec.md` (include plan/tasks if required) to refresh the queue.
  - Capture answers directly in the spec or supporting docs; add ADRs when the decision is architectural.
  - Remove or annotate any `[NEEDS CLARIFICATION]` markers as you close them.
- Update downstream artifacts
  - Sync `plan.md`, `data-model.md`, and `tasks.md` with the new rulings.
  - Update constitution notes if patterns suggest a new rule or amendment.
- Run `/analyze` for coverage
  - Execute `/analyze @specs/<feature>/` and review every red or yellow item.
  - Add tasks, clarify requirements, or adjust tests until the report is clean.
- Lock the loop
  - Mark the clarification log with final statuses (Open → Answered → Deferred).
  - Store the latest `/clarify` and `/analyze` outputs in `.specify/memory/` (or your docs location) for traceability; see the sketch after this list.
  - Signal “Ready for Plan/Implementation” only when both queues are clear.
Deliverables
- Updated spec/plan/tasks reflecting resolved questions and new decisions
- Clarification log summarizing questions, owners, outcomes, and links
- Clean `/analyze` report archived alongside the feature artifacts
- ADRs or constitution updates for any significant policy or architectural change
Quality Gates ✅
- No unresolved items remain in `/clarify`; deferred questions have owners and deadlines
- `/analyze` reports zero blocking gaps; warnings are either fixed or explicitly accepted with rationale
- Commit history references the clarification or ADR that justified each change
- The constitution reflects any repeat insights discovered during the loop
Common pitfalls
- Skipping `/clarify` after spec edits and letting assumptions leak into implementation
- Ignoring `/analyze` results or treating them as optional advice
- Leaving answers in chat logs instead of updating the spec or ADRs
- Forgetting to store transcripts, making decision trails impossible to audit
References
- Spec Kit Plus repo: https://github.com/panaversity/spec-kit-plus
- PyPI package: https://pypi.org/project/specifyplus/
- Spec Kit Clarify/Analyze Demo (optional deep dive): https://www.youtube.com/watch?v=YD66SBpJY2M
- Spec Kit Nine Articles: https://github.com/panaversity/spec-kit-plus/blob/main/spec-driven.md#the-nine-articles-of-development
- Original Spec Kit repo: https://github.com/github/spec-kit