Spec-Driven Development

Over the past month, I’ve experimented with various approaches to make AI-driven development more productive—shifting from vibe coding to structured, context- and spec-driven workflows. This journey reflects a clear evolution: from ad-hoc solutions to professionalized workflows, and finally to streamlined automation.

In my blog post from September 2025, I outlined how context engineering is transforming development workflows. Here’s what I implemented to put theory into practice:

  • Memory-Bank for cline.bot (with custom optimizations)
  • CLAUDE.md: Adapting the memory-bank concept for Claude
  • Using the BMAD Method
  • Get-Shit-Done (GSD) Framework

How do you manage context?

One of the most interesting challenges in working with LLMs for development is managing the context window. How do you keep it just detailed enough for the LLM to understand your goals, while leaving room for analysis and execution? Let’s call it an Agent — you delegate tasks, and it handles them for you.

To make this work, you need to provide the Agent with:

  • Information about the application’s purpose and architecture
  • Guidelines for architectural decisions
  • Instructions for documentation and code changes
  • Architectural Decision Records (ADRs) to track what worked and what didn’t

Feeding this information to the Agent every time is cumbersome. That’s where Memory-Banks come in.

Memory-Bank for cline.bot

The cline.bot documentation provides a solid foundation for setting up a memory-bank. Over time, I refined the structure to ensure the update memory-bank command was always called, maintaining context consistency.

Here’s how it helps:

  • Project Brief: A high-level overview of the project’s goals and scope.
  • System Pattern: A brief architectural overview of the application.
  • Tech Context: Technical details, dependencies, and constraints.
  • Active Context: What we’re currently working on, so we don’t need to repeat ourselves.

With cline.bot, you can use multiple LLM backends and even set rules, like always starting with a test. The memory-bank makes it easier to use different models for planning and execution. For example, use a high-end model for planning and a cheaper model for execution. Overall, the memory-bank helps models understand the current task and stick to quality constraints.
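As a minimal sketch, the four files above can be scaffolded in one go. The file names follow the cline.bot memory-bank convention; treat them as placeholders for your own structure:

```shell
#!/bin/sh
# Minimal sketch: scaffold the four memory-bank files described above.
# File names follow the cline.bot convention; adjust them to your project.
cd "$(mktemp -d)" || exit 1
mkdir -p memory-bank
for f in projectbrief systemPatterns techContext activeContext; do
    printf '# %s\n' "$f" > "memory-bank/$f.md"
done
ls memory-bank
```

From there, each file grows as the Agent updates it; the scaffold only fixes the structure.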

Can we do this with Copilot?

While experimenting with different models and tools, I wanted to see if the memory-bank concept could also work with GitHub Copilot. Using the chat mode, I created a personal agent with a system prompt that included the memory-bank and instructions to update it after each task.

How about CLAUDE.md?

I adapted the Copilot agent’s prompt for the CLAUDE.md file. Now, Claude also references the memory-bank and aligns its tasks with the project’s goals. If it doesn’t, a simple update memory-bank command brings it back on track.
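A minimal sketch of what such a CLAUDE.md can look like (the wording below is illustrative, not a verbatim copy of my file):

```shell
#!/bin/sh
# Illustrative sketch: write a minimal CLAUDE.md that points Claude at the
# memory-bank and asks it to keep the bank current. Wording is an example.
cd "$(mktemp -d)" || exit 1
cat > CLAUDE.md <<'EOF'
# Project memory

Before starting any task, read ./memory-bank/*.md for the project's goals,
architecture, and active context. After finishing a task, update
activeContext.md and record notable decisions in systemPatterns.md.
When asked to "update memory-bank", re-read all files and refresh them.
EOF
wc -l CLAUDE.md
```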

How do we keep the quality high?

The beauty of these contexts is that you don’t need to repeat yourself. However, over time, I noticed that quality could degrade when working with multiple models and tools. That’s when I started experimenting with the BMAD Method and later, Get-Shit-Done (GSD).

First Impressions: Why BMAD Stands Out

My first impression of the BMAD Method (version 4 or earlier): **It’s impressive!** Why? The creator (yes, it always starts with one person 😉) designed a full **team of specialized agents**—essentially system prompts for different roles, but that’s all you need. Here’s the list of agents BMAD provides out of the box:

  • Analyst – for market and competitive analysis
  • Architect – for system architecture
  • UX Expert – for user experience and interface design
  • Project Manager – for project planning and execution
  • Product Owner – for requirements and prioritization
  • Scrum Master – for agile processes
  • Developer – for implementation
  • Infra DevOps Specialist – for deployment and infrastructure
  • QA Specialist – for quality assurance and testing

That’s a full team, isn’t it? 😄 The method is so powerful that it even requires a flowchart to keep track of agent orchestration—otherwise, you might lose yourself in the process:

```mermaid
graph TD
    A["Start: Project Idea"] --> B{"Optional: Analyst Research"}
    B -->|Yes| C["Analyst: Brainstorming (Optional)"]
    B -->|No| G{"Project Brief Available?"}
    C --> C2["Analyst: Market Research (Optional)"]
    C2 --> C3["Analyst: Competitor Analysis (Optional)"]
    C3 --> D["Analyst: Create Project Brief"]
    D --> G
    G -->|Yes| E["PM: Create PRD from Brief (Fast Track)"]
    G -->|No| E2["PM: Interactive PRD Creation (More Questions)"]
    E --> F["PRD Created with FRs, NFRs, Epics & Stories"]
    E2 --> F
    F --> F2{"UX Required?"}
    F2 -->|Yes| F3["UX Expert: Create Front End Spec"]
    F2 -->|No| H["Architect: Create Architecture from PRD"]
    F3 --> F4["UX Expert: Generate UI Prompt for Lovable/V0 (Optional)"]
    F4 --> H2["Architect: Create Architecture from PRD + UX Spec"]
    H --> Q{"Early Test Strategy? (Optional)"}
    H2 --> Q
    Q -->|Yes| R["QA: Early Test Architecture Input on High-Risk Areas"]
    Q -->|No| I
    R --> I["PO: Run Master Checklist"]
    I --> J{"Documents Aligned?"}
    J -->|Yes| K["Planning Complete"]
    J -->|No| L["PO: Update Epics & Stories"]
    L --> M["Update PRD/Architecture as needed"]
    M --> I
    K --> N["📁 Switch to IDE (If in a Web Agent Platform)"]
    N --> O["PO: Shard Documents"]
    O --> P["Ready for SM/Dev Cycle"]

    style A fill:#f5f5f5,color:#000
    style B fill:#e3f2fd,color:#000
    style C fill:#e8f5e9,color:#000
    style C2 fill:#e8f5e9,color:#000
    style C3 fill:#e8f5e9,color:#000
    style D fill:#e8f5e9,color:#000
    style E fill:#fff3e0,color:#000
    style E2 fill:#fff3e0,color:#000
    style F fill:#fff3e0,color:#000
    style F2 fill:#e3f2fd,color:#000
    style F3 fill:#e1f5fe,color:#000
    style F4 fill:#e1f5fe,color:#000
    style G fill:#e3f2fd,color:#000
    style H fill:#f3e5f5,color:#000
    style H2 fill:#f3e5f5,color:#000
    style Q fill:#e3f2fd,color:#000
    style R fill:#ffd54f,color:#000
    style I fill:#f9ab00,color:#fff
    style J fill:#e3f2fd,color:#000
    style K fill:#34a853,color:#fff
    style L fill:#f9ab00,color:#fff
    style M fill:#fff3e0,color:#000
    style N fill:#1a73e8,color:#fff
    style O fill:#f9ab00,color:#fff
    style P fill:#34a853,color:#fff
```

I applied BMAD to a brownfield project and used it to clarify goals and tasks. After initializing the project:

```
@pm *create-brownfield-prd
```

Files were generated in the BMAD folders. The Analyst agent’s brainstorming feature was particularly valuable—so much so that I created a custom agent in my Mistral subscription.

The Analyst created a Product Requirements Document (PRD) that even considered user personas. The UX Expert improved the app’s screen flow, and the Product Owner generated epics and stories from my existing to-dos.

However, I hit a wall: managing multiple agents became overwhelming. After a month, I was frustrated—did I call the Scrum Master for the stories? Did the PO verify them? Did QA check the tests? Are the GitHub issues linked to the commits? The overhead became unmanageable, and I took a two-month break.

Get-Shit-Done to the rescue!

After my frustration with BMAD, I turned to the Get-Shit-Done (GSD) Framework. It promised to address my core issue:

> **Why I Built This**
>
> I’m a solo developer. I don’t write code—Claude Code does.
>
> Other spec-driven development tools exist—BMAD, Speckit—but they make things way more complicated than necessary. I don’t want enterprise theater. I just want to build great things that work.
>
> So I built GSD. The complexity is in the system, not in your workflow. What you see: a few commands that just work.
>
> The system gives Claude everything it needs to do the work and verify it. I trust the workflow. It just does a good job.
>
> No enterprise roleplay bullshit. Just an incredibly effective system for building cool stuff consistently using Claude Code.
>
> — TÂCHES, creator of GSD

GSD worked seamlessly with my brownfield project. After some tweaking, it used existing sources and documentation to create a new .planning folder with five key documents:

  • PROJECT.md
  • STATE.md
  • REQUIREMENTS.md
  • ROADMAP.md
  • MILESTONES.md

After initialization, the most important commands are:

```
/gsd:new-milestone
```

This creates a new milestone. The workflow follows a simple loop: discuss → plan → execute → verify until the milestone is complete. GSD manages the agents for you, so you can focus on getting things done.
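The loop can be pictured as pseudocode. The stubs below are purely illustrative; the real phases run inside Claude Code via the /gsd commands:

```shell
#!/bin/sh
# Illustrative pseudocode for the GSD phase loop. The stub functions only
# model the flow; the actual work happens inside Claude Code.
cd "$(mktemp -d)" || exit 1
discuss() { echo "discuss: clarify scope and constraints"; }
plan()    { echo "plan: break the milestone into phases"; }
execute() { echo "execute: let the agent implement and test"; }
verify()  { touch .milestone-done; }   # stand-in for GSD's verify phase
milestone_done() { [ -f .milestone-done ]; }

until milestone_done; do
    discuss
    plan
    execute
    verify
done
echo "milestone complete"
```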

Conclusion

If you want to distinguish yourself from “vibe coders” who unknowingly introduce security risks, you’ll want to adopt spec-driven or context-driven development.

How does it differ? With context-driven development, you focus on planning and documenting your goals. To avoid repetition, you need tools like memory-banks or frameworks like BMAD and GSD.

My Current “Perfect Setup” (February 2026)

Disclaimer: In the AI world, things evolve fast—this setup works today, but may not in five months.

After countless experiments with BMAD, GSD, and memory banks, here’s the hybrid workflow that works best for me—optimized for greenfield projects and context-driven development:

1. Brainstorming: BMAD’s Analyst (or Mistral Le Chat)

I start with BMAD’s Analyst agent (or a fine-tuned Mistral prompt) for structured ideation. It generates:

  • Personas and user journeys
  • Core features as user stories
  • UI scribbles (text-based descriptions for wireframes)
  • Technical constraints and architecture sketches

Example Prompt for Mistral:

"Act as a senior product analyst. Brainstorm a greenfield project for [target audience]. Include:
1) User personas
2) Core features as user stories
3) UI scribbles (text-based descriptions for wireframes)
4) Technical constraints.
Focus on [specific problem]."

Output: A brainstorming.md with clear epics/stories for GSD + screenshots/scribbles for the Memory-Bank.

2. Execution: Get-Shit-Done (GSD)

GSD requires rich context to be effective. Without brainstorming results, you risk:

  • Token waste (Opus burns through your budget quickly)
  • Context drift (the agent hallucinates unrelated features)

My GSD Workflow:

  • Start with /gsd:new-milestone using brainstorming output
  • Follow the phase loop: discuss → plan → execute → verify
  • Use /gsd:set-profile budget for routine tasks

3. Documentation & Memory-Bank

The balancing act: Too much context clutters the window; too little leads to duplicate work.

My Rules:

  • Store only: Decisions, screenshots, changelogs
  • Exclude: Generic docs (→ separate wiki), outdated scribbles
  • Auto-generate architecture_overview.md every 3-5 commits
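One way to automate the last rule is a small post-commit hook. The threshold and the generator command are placeholders, so treat this as a sketch:

```shell
#!/bin/sh
# Sketch of a post-commit hook: regenerate architecture_overview.md on
# every Nth commit. N and the generator command are placeholders.
should_regenerate() {
    count=$1      # total commits, e.g. $(git rev-list --count HEAD)
    n=${2:-5}     # regenerate every n commits
    [ $((count % n)) -eq 0 ]
}

# In .git/hooks/post-commit you would call something like:
#   should_regenerate "$(git rev-list --count HEAD)" && <your generator>
should_regenerate 10 5 && echo "time to regenerate architecture_overview.md"
```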

4. Quality Gates

GSD sometimes forgets context. My safeguards:

  • Weekly check: “Do recent changes conflict with systemPatterns.md?”
  • Pre-merge: Verify with a separate LLM (e.g., Gemini in “act mode”)

5. Escape Hatches

If GSD misbehaves:

  • Fallback to cline.bot: cline update-memory --files brainstorming.md,architecture_overview.md
  • Code critical parts manually (AI is a multiplier, not a replacement)

Key Lessons Learned

1. AI workflows are like kitchen gadgets:

  • BMAD = Food processor (powerful but messy)
  • GSD = Knife set (precise but requires skill)
  • Memory-Bank = Spice rack (overstuff it and you’ll lose track)

2. 80/20 rule applies: 80% of quality comes from 20% of the context (brainstorming.md + systemPatterns.md).

3. AI is a copilot with ADHD: You’re still the pilot. Know when to take control.

In the end

Depending on your Claude plan, be mindful of token usage. The latest GSD version includes a new feature to switch model profiles:

```
/gsd:set-profile <profile>
```

Enjoy experimenting with these tools, and let me know what works for you!

