Why We Built Our Own Spec Kit for Brownfield Projects

The spec kits available today have several limitations. I'll use GitHub Spec Kit as the example.

In our organization, we used GitHub Spec Kit across all of our brownfield projects for around two months. During that time we ran into several limitations and started looking for ways to address them.

We explored nearly every resource available online and eventually decided to build our own custom spec kit that specifically addresses the challenges we faced.

Our entire team, along with several engineering teams, contributed to building this spec kit. Over the last six months, we have gone through multiple iterations and refinements. Today, I can confidently say that we are in a much better position: lower token consumption, faster development, and fewer unnecessary vibe-coding sessions. Most importantly, it works effectively for brownfield projects.

Every file was carefully designed, and every guardrail was thoroughly discussed. We made sure the guardrails provided enough flexibility for AI to operate effectively without being unnecessarily constrained.

The Problems We Observed

Here are some of the major issues we encountered with existing spec kits.

1. Existing Spec Kits Are Not Designed for Brownfield Projects

Most spec kits assume that a project has a complete and accurate knowledge base. In reality, brownfield projects often contain legacy code, undocumented decisions, and patches that may conflict with the project's stated architecture or constitution.

As a result, a constitution generated from such a codebase is rarely complete or accurate.

2. The Constitution Is Too Restrictive

GitHub Spec Kit relies heavily on a strict constitution. While this can be useful for maintaining consistency, it often makes the AI overly cautious and prevents it from making necessary modifications.

In many feature development scenarios, we need to modify existing implementations, refactor code, or evolve architectural decisions. The constitution-driven workflow can make these changes difficult because the AI becomes overly cautious about violating existing rules.

3. A Single Constitution File Does Not Scale

Another challenge is that GitHub Spec Kit maintains a single constitution file for the entire codebase.

This becomes problematic as the project grows. Even a small bug fix may require loading a massive constitution file into context, consuming a large number of tokens despite only a small portion being relevant.

We tested this approach with our BusinessRuleApi project, one of the largest repositories we maintain. It is a rule engine responsible for generating execution rules. The constitution kept growing as we expanded it. Once it reached approximately 10,000 lines while still covering less than 10% of the project, we decided that maintaining a single constitution file was no longer practical.

The token consumption simply became too expensive.

Summary of the Challenges

After using GitHub Spec Kit for roughly two months, these were the primary issues we identified:

Too-Strict Guardrails

The guardrails often restrict AI from making changes to existing code, even when those changes are necessary for feature development. This issue is most noticeable with the constitution.md approach used by GitHub Spec Kit.

Large Context Consumption

As the constitution grows, it becomes a massive file that must be processed repeatedly.

A bug fix that should ideally consume 300K–500K tokens can easily require 3M–4M tokens because the AI has to process the entire constitution, even when only a small section is relevant.

Dependence on High-End Models

In our experience, GitHub Spec Kit generally requires higher-end models such as Sonnet and, in many cases, Opus.

This is primarily because the context window requirements become very large. As a result, operational costs increase significantly.

With our new spec kit, we are able to use Cursor Composer effectively, which is substantially cheaper because it requires less context and does not depend on premium models for most tasks.

Limited Brownfield Support

Most importantly, existing spec kits are not optimized for brownfield projects.

The two main reasons we moved away from existing solutions were:

High operational costs.
Long vibe-coding sessions that increased developer effort.

Our custom spec kit is intentionally lightweight in both workflow and context requirements, while being specifically designed to handle brownfield project challenges.

What We Changed

We created two primary workflows:

A moderate workflow for feature development.
A lightweight workflow for bug fixes and smaller tasks.

This approach speeds up execution, reduces developer effort, and significantly lowers token consumption.
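The split between the two workflows can be pictured as a simple routing step. The sketch below is only an illustration of that idea: the task labels and the `choose_workflow` function are hypothetical names, not part of our actual spec kit.

```python
from enum import Enum


class Workflow(Enum):
    MODERATE = "moderate"        # feature development
    LIGHTWEIGHT = "lightweight"  # bug fixes and smaller tasks


# Hypothetical task labels; the article does not define a task taxonomy.
LIGHTWEIGHT_TASKS = {"bugfix", "chore", "small-change"}


def choose_workflow(task_type: str) -> Workflow:
    """Route a task to one of the two workflows described above."""
    if task_type.lower() in LIGHTWEIGHT_TASKS:
        return Workflow.LIGHTWEIGHT
    # Anything not explicitly marked as small goes through the fuller workflow.
    return Workflow.MODERATE
```

Defaulting unknown task types to the moderate workflow is a design assumption here: it errs on the side of more process rather than less.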

Knowledge Architecture

The biggest architectural change we introduced was restructuring project knowledge.

Think of it like a book:

A table of contents.
Chapters.
Subsections.
References to detailed pages.

Instead of storing everything in a single document, we divide project knowledge into major flows, subflows, and detailed reference documents.

This allows AI to retrieve only the information that is relevant to the task at hand instead of processing unnecessary content.

As a result, token usage is dramatically reduced.

For example, in our BusinessRuleApi project, GitHub Spec Kit often consumed between 2M and 4M tokens for a single request because of the amount of context involved.

With our custom spec kit, the same tasks typically consume between 700K and 1.5M tokens. Simple bug fixes usually require only 600K–700K tokens, which is significantly lower.
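To make the book analogy concrete, here is a minimal sketch of how a table of contents, chapters, and subsections could be walked so that only matching branches are loaded. It assumes a simple keyword-overlap match; `KnowledgeNode`, `select_context`, and the sample titles are hypothetical and do not reflect the real file layout of our spec kit.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class KnowledgeNode:
    """One entry in the knowledge tree: a flow, a subflow, or a reference doc."""
    title: str
    keywords: set[str]
    body: str = ""
    children: list[KnowledgeNode] = field(default_factory=list)


def _collect(node: KnowledgeNode, task_terms: set[str]) -> list[str]:
    # Skip the whole branch when the node shares no keyword with the task.
    if not (node.keywords & task_terms):
        return []
    docs = [node.body] if node.body else []
    for child in node.children:
        docs.extend(_collect(child, task_terms))
    return docs


def select_context(toc: list[KnowledgeNode], task_terms: set[str]) -> list[str]:
    """Return only the document bodies relevant to the task terms."""
    docs: list[str] = []
    for chapter in toc:
        docs.extend(_collect(chapter, task_terms))
    return docs


# Illustrative data only.
toc = [
    KnowledgeNode(
        "Rule generation", {"rule"}, "Rule generation overview",
        [KnowledgeNode("Validation", {"validation"}, "Validation details")],
    ),
    KnowledgeNode("Billing", {"billing"}, "Billing overview"),
]
context = select_context(toc, {"rule", "validation"})
```

In this sketch a subsection is reached only if its parent chapter also matches, so the unrelated chapter is never read. That is the property that keeps context small, whatever matching strategy is actually used.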

Flexible Guardrails

Another major improvement is our ability to choose how strict the guardrails should be.

We found that extremely strict guardrails frequently blocked larger development efforts that required modifications to existing infrastructure or architecture.

To address this, we categorized guardrails into different modes, such as:

Technical Analyst
Technical Devil's Advocate
Additional specialized review profiles

Each profile approaches the project differently and provides a different level of scrutiny.

Depending on the task, we can select the most appropriate profile. This gives AI more flexibility where needed while still maintaining quality and oversight.
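One way to think about profile selection is as a lookup keyed by how much scrutiny a task can tolerate. The sketch below is an assumption-heavy illustration: the profile names come from the list above, but the numeric `scrutiny` scale, the values assigned to each profile, and the `select_profile` function are placeholders made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardrailProfile:
    name: str
    scrutiny: int  # assumed scale: 1 = permissive, 3 = strict


# Names are from the article; the scrutiny values are placeholders.
PROFILES = {
    "analyst": GuardrailProfile("Technical Analyst", scrutiny=1),
    "devils_advocate": GuardrailProfile("Technical Devil's Advocate", scrutiny=3),
}


def select_profile(max_scrutiny: int) -> GuardrailProfile:
    """Pick the strictest profile that does not exceed the requested level."""
    allowed = [p for p in PROFILES.values() if p.scrutiny <= max_scrutiny]
    if not allowed:
        raise ValueError(f"no profile at or below scrutiny {max_scrutiny}")
    return max(allowed, key=lambda p: p.scrutiny)
```

A large refactor might be run with a lower `max_scrutiny` so the AI is free to change existing code, while a small, risky patch could use the strictest profile available.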

Flow Overview

Below is a file diagram of our spec kit.

What's Next

This article is still a work in progress.

Upcoming sections will cover:

Complete architecture breakdown
Design decisions and rationale
Pros and cons
Lessons learned
Detailed implementation examples

I'll update this document shortly with the remaining sections.
