How We Built an AI That Understands What Clients Actually Want (Without the Back-and-Forth)

The Problem: Lost in Translation

Every web developer knows this pain: A client fills out an intake form, and you get responses like:

"I want a modern, professional website that really pops. Something clean but also engaging. You know, like Apple but warmer. Oh, and we need booking."

Great. What does "pops" mean? What kind of booking? Is "modern professional warmth" even possible?

The traditional solution: Schedule a 60-minute call to ask clarifying questions, take notes, write it up, send it back for approval, revise, repeat. By the time you understand what they actually want, you've burned 4-6 hours of back-and-forth.

There had to be a better way.

Why a Bot? (And Not Just a Better Form)

Our first instinct was to improve the form itself—better questions, conditional logic, smart defaults. But we quickly realized: The problem isn't the questions. The problem is that human intent is messy.

Clients don't think in website-developer terms. They think in:

"I want customers to find us easily"
"We're different because we do same-day service"
"Something like Patagonia's vibe but for plumbing"

A traditional form forces them into checkboxes and dropdowns. A chatbot conversation lets them talk naturally—but then you're left with unstructured text soup.

The Real Challenge

We weren't building a form-to-website pipeline. We were building a requirements compiler—something that could take messy human input and transform it into machine-safe specifications.

That's a fundamentally different problem.

The Insight: Evidence ≠ Truth

Here's where things got interesting. We stopped treating client input as "the requirements" and started treating it as evidence of requirements.

Think about it: When a client says "modern minimalist design," they're providing evidence that they care about aesthetics. But what does "modern minimalist" actually mean?

Flat design with lots of white space (Apple-style)?
Warm minimalism with earth tones (Patagonia-style)?
Bold brutalist with strong typography?

All three are "modern minimalist." The client's input is evidence—it needs interpretation, validation, and sometimes clarification.

This shift in thinking led to our four-layer architecture:

1. EVIDENCE    (what client said - raw, messy, contradictory)
   ↓
2. INTENT      (what we can extract - normalized, validated)
   ↓  
3. DECISIONS   (what we commit to - resolved, specific)
   ↓
4. ARTIFACTS   (what gets built - specs, designs, code)

Each layer can be improved independently without collapsing the system.
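As a rough illustration of how the four layers might be kept separate, here is a minimal Python sketch. The class and field names (`Evidence`, `Claim`, `Decision`, `Artifact`, `source_field`, and so on) are assumptions made for this example, not the actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """Layer 1: raw client input, stored verbatim."""
    evidence_id: str
    source_field: str  # hypothetical intake field name
    raw_text: str


@dataclass(frozen=True)
class Claim:
    """Layer 2: one normalized statement of intent, linked to its evidence."""
    claim_id: int
    text: str
    category: str
    evidence_id: str


@dataclass(frozen=True)
class Decision:
    """Layer 3: a resolved, specific commitment backed by one or more claims."""
    decision_id: str
    text: str
    claim_ids: tuple[int, ...]


@dataclass
class Artifact:
    """Layer 4: a generated file and the decisions it was built from."""
    path: str
    decision_ids: list[str] = field(default_factory=list)


evidence = Evidence("ev-1", "brand_description",
                    "We want the website to feel cozy and welcoming.")
claim = Claim(1, "Brand personality should be cozy and welcoming",
              "brand", evidence.evidence_id)
decision = Decision("dec-1", "Use warm minimalism", (claim.claim_id,))
artifact = Artifact("constitution.md", [decision.decision_id])
```

Because each layer only holds IDs pointing at the layer above it, a layer can be regenerated without rewriting the others.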

Where AI Helps (And Where It Doesn't)

This is crucial: We don't use AI everywhere. AI is expensive, non-deterministic, and sometimes hallucinates. We use it only where human interpretation is genuinely required.

What We DON'T Send to AI

Structured data like:

Business name: "Sunrise Bakery"
Email: hello@sunrisebakery.com
Phone: (530) 555-0199
Pages needed: Home, Menu, About, Contact

This goes through deterministic code—normalization, validation, and direct extraction. Why pay an API call for something a regex can do in milliseconds?
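A minimal sketch of what that deterministic track could look like in Python. The function names, the deliberately loose email pattern, and the choice to normalize US phone numbers to a `+1` prefixed form are all assumptions for illustration.

```python
import re

# Loose sanity check only; assumed good enough for an intake form.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalize_email(value: str) -> str:
    """Trim, lowercase, and sanity-check an email address."""
    email = value.strip().lower()
    if not EMAIL_RE.match(email):
        raise ValueError(f"invalid email: {value!r}")
    return email


def normalize_us_phone(value: str) -> str:
    """Strip punctuation and return a 10-digit US number with a +1 prefix."""
    digits = re.sub(r"\D", "", value)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        raise ValueError(f"invalid US phone number: {value!r}")
    return f"+1{digits}"


def parse_pages(value: str) -> list[str]:
    """Split a comma-separated page list into clean names."""
    return [page.strip() for page in value.split(",") if page.strip()]


print(normalize_email(" Hello@SunriseBakery.com "))
print(normalize_us_phone("(530) 555-0199"))
print(parse_pages("Home, Menu, About, Contact"))
```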

What We DO Send to AI

Unstructured descriptions like:

"We're a family bakery that's been in Auburn, CA for 30 years. We make everything from scratch daily using grandma's recipes. People love us because we remember their names and their usual orders. We want the website to feel cozy and welcoming, not sterile like those chain bakeries."

This needs interpretation. An AI agent breaks this into atomic claims:

"Family bakery with 30 years of Auburn history"
"Makes products from scratch daily using traditional recipes"
"Values personal connection and remembering customers"
"Brand personality should be cozy and welcoming"
"Must differentiate from chain bakery sterility"

Now we have specific, actionable requirements instead of a paragraph.
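Since the AI step is probabilistic, its output is worth validating before storage. The sketch below assumes the model is asked to return a JSON array of objects with `text` and `category` keys; that shape, the category names, and `parse_claims` itself are hypothetical.

```python
import json

# Assumed category set, based on the areas the claims are said to cover.
ALLOWED_CATEGORIES = {"brand", "audience", "feature", "content", "design", "constraint"}


def parse_claims(model_output: str) -> list[dict]:
    """Validate raw model output and return numbered claims."""
    data = json.loads(model_output)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of claims")
    claims = []
    for index, item in enumerate(data, start=1):
        if not isinstance(item, dict):
            raise ValueError(f"claim {index} is not an object")
        text = str(item.get("text", "")).strip()
        category = item.get("category")
        if not text:
            raise ValueError(f"claim {index} has no text")
        if category not in ALLOWED_CATEGORIES:
            raise ValueError(f"claim {index} has unknown category {category!r}")
        claims.append({"id": index, "text": text, "category": category})
    return claims


sample = '[{"text": "Family bakery with 30 years of Auburn history", "category": "brand"}]'
claims = parse_claims(sample)
```

Rejecting malformed output outright, rather than repairing it, keeps hallucinated or half-formed claims out of later stages.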

The Two-Track Processing System

Imagine two assembly lines running in parallel:

Track 1: Structured Facts (code)

Business name → Validation → Storage
Contact info → Format normalization → Storage  
Page selections → Array parsing → Storage

Fast, deterministic, cheap.

Track 2: Unstructured Text (AI)

"Our bakery is..." → AI atomization → Validation → Storage
"We want customers to..." → AI interpretation → Validation → Storage

Slower, probabilistic, valuable.

Both tracks merge before conflict detection. This way we get:

✅ Zero data loss (structured facts preserved exactly)
✅ Rich interpretation (AI extracts nuance from descriptions)
✅ Cost efficiency (AI only where needed)
✅ Speed (80% of data processed instantly)
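The routing between the two tracks can be sketched in a few lines of Python. The field names in `STRUCTURED_FIELDS` are assumptions, and `fake_atomize` is only a stand-in for the real AI call so the example runs on its own.

```python
from typing import Callable

# Hypothetical names of the intake fields that never need interpretation.
STRUCTURED_FIELDS = {"business_name", "email", "phone", "pages"}


def route_intake(form: dict[str, str],
                 atomize: Callable[[str], list[str]]) -> dict:
    """Send structured fields to storage as-is and free text to the atomizer."""
    facts: dict[str, str] = {}
    claims: list[str] = []
    for name, value in form.items():
        if name in STRUCTURED_FIELDS:
            facts[name] = value.strip()      # Track 1: deterministic
        else:
            claims.extend(atomize(value))    # Track 2: interpretation
    return {"facts": facts, "claims": claims}


def fake_atomize(text: str) -> list[str]:
    """Stand-in for the AI agent: naive sentence splitting."""
    return [part.strip() for part in text.split(".") if part.strip()]


form = {
    "business_name": "Sunrise Bakery",
    "email": "hello@sunrisebakery.com",
    "description": "We make everything from scratch daily. We want the site to feel cozy.",
}
result = route_intake(form, fake_atomize)
```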

Conflict Detection: The AI's Most Valuable Job

Here's where the system gets smart. Once we have 60-70 atomic claims extracted from the intake, we run them through a Conflict Detector agent.

This agent looks for:

Contradictions:

Claim #12: "Budget-conscious pricing is a key differentiator"
Claim #47: "Premium, high-end service experience"

Wait—which is it? Budget or premium?

Ambiguities:

Claim #23: "Modern design aesthetic"

Modern like brutalist concrete? Modern like glassmorphism? Modern like flat design from 2015?

Critical Gaps:

42 claims about the business, but zero claims about who the customers are.

How do you design a website without knowing the audience?

The AI doesn't just flag these—it suggests clarifying questions with multiple-choice options and provides fallback resolutions if the client doesn't respond.
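Contradictions and ambiguities need interpretation, but the "critical gap" case can at least be pre-checked with plain code. The following is a sketch of such a pre-check, not the Conflict Detector agent itself; the category names and `find_gaps` are assumptions.

```python
from collections import Counter

# Assumed set of categories every intake should cover.
REQUIRED_CATEGORIES = ("brand", "audience", "feature", "content", "design", "constraint")


def find_gaps(claims: list[dict]) -> list[str]:
    """Return the required categories that have no claims at all."""
    counts = Counter(claim["category"] for claim in claims)
    return [category for category in REQUIRED_CATEGORIES if counts[category] == 0]


claims = [
    {"text": "Family bakery with 30 years of Auburn history", "category": "brand"},
    {"text": "Brand personality should be cozy and welcoming", "category": "design"},
]
gaps = find_gaps(claims)
```

Here `gaps` includes `"audience"`, which is the situation described above: plenty of claims about the business and none about the customers.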

The Clarification Loop (Without the Endless Back-and-Forth)

When conflicts or gaps are detected, here's what happens:

1. Smart Triage

Not every question gets asked. The system categorizes issues:

Critical (blocks all work): "What's your business name?"
High (blocks major features): "Do you need booking or just a contact form?"
Medium (has reasonable defaults): "Which shade of minimalism?"

Medium-severity issues get auto-resolved with fallbacks, which are shown to the client for approval:

✅ Our decision: We'll use warm minimalism (Patagonia-style) based on your reference to "welcoming" and "cozy"

Want to change this? [Click here]
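The triage rule can be sketched as follows. The `Issue` class and the rule "medium severity plus an available fallback means auto-resolve" are an assumed reading of the description above.

```python
from dataclasses import dataclass
from typing import Optional

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2}


@dataclass
class Issue:
    question: str
    severity: str                   # "critical", "high", or "medium"
    fallback: Optional[str] = None  # default shown to the client for approval


def triage(issues: list[Issue]) -> tuple[list[Issue], list[Issue]]:
    """Split issues into questions to ask and issues resolved with a fallback."""
    to_ask: list[Issue] = []
    auto_resolved: list[Issue] = []
    for issue in issues:
        if issue.severity == "medium" and issue.fallback is not None:
            auto_resolved.append(issue)
        else:
            to_ask.append(issue)
    # Most blocking questions first.
    to_ask.sort(key=lambda issue: SEVERITY_ORDER[issue.severity])
    return to_ask, auto_resolved


issues = [
    Issue("Which shade of minimalism?", "medium", "warm minimalism (Patagonia-style)"),
    Issue("Do you need booking or just a contact form?", "high"),
    Issue("What's your business name?", "critical"),
]
to_ask, auto_resolved = triage(issues)
```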

2. Approval Flow, Not Re-Asking

Instead of asking the same question again, we show our interpretation:

✓ Keep our decision → Proceed
✗ I want to change it → Opens text field

This reduces a 10-question follow-up to maybe 1-2 actual overrides.

3. Structured Responses

Questions with options use button values, not free text:

Client sees: "Keep playful tone - it makes us memorable"
System receives: OPTION_A

Then we use a claim template to hydrate the structured choice into a full requirement:

OPTION_A → "Brand tone should be playful and edgy, emphasizing memorable differentiation"

This creates perfect, semantic claims from button clicks. Zero ambiguity.

From Claims to Constitution to Code

Once we have validated, conflict-free claims, the magic happens in three stages:

Stage 1: Constitutional Principles

An AI agent extracts the immutable principles for this specific project:

Example for a luxury spa:

### Brand Authenticity
- Serene, calming aesthetic is non-negotiable
- Luxury positioning must be evident in every touchpoint  
- Avoid clinical/medical imagery despite wellness focus

Rationale: Premium pricing requires premium presentation

These become the guardrails for all future decisions. Unlike generic best practices, these are specific to THIS business.

Stage 2: Specification

Another AI agent generates a complete specification following software engineering practices:

User stories with priorities (P1, P2, P3)
Functional requirements (numbered, testable)
Success criteria (measurable outcomes)
Edge cases (what if form fails? What if no testimonials yet?)

Example snippet:

### User Story 1 - Book Spa Service (Priority: P1) 🎯 MVP

Potential client can view available services and book appointments online.

**Acceptance Scenarios**:
1. **Given** visitor views Services page, **When** they select a treatment, **Then** they see pricing, duration, and booking button
2. **Given** visitor clicks "Book Now", **When** they select date/time, **Then** they see available slots and can confirm

All derived from claims. All traceable. All measurable.

Stage 3: Implementation Plan

A third AI agent maps the spec to concrete components and pages:

"Contact form with service selection" → ContactForm component with specific fields
"Showcase treatments" → ServiceGrid component, 3-column layout, card variant
"Luxury aesthetic" → Design tokens: #1A1A1A, serif typography, generous spacing

The plan includes:

Exact page layouts (section by section)
Component specifications (props, content, styling)
Design system (colors, typography, spacing scales)
Build phases (MVP first, then enhancements)
Quality gates (performance, accessibility, SEO)

The Repository That Builds Itself

At the end of this pipeline, we don't just have documents. We programmatically create a GitHub repository with everything pre-configured:

✅ Constitution, spec, and plan as markdown files
✅ Kilocode/Claude AI workflows for implementation
✅ Design tokens auto-generated from constitutional principles
✅ Vercel deployment pipeline configured
✅ Accessibility and performance CI checks
✅ README with quick-start instructions

Then we trigger Kilocode/Claude with a simple implementation command.

The AI reads the constitution (guardrails), spec (requirements), and plan (component mapping), then writes the actual website code—honoring every principle, meeting every requirement, following every design decision.

Why This Matters: The Engineering Perspective

1. Traceability

Every component in the final website traces back to a specific claim, which traces back to specific client input:

"Why is the tagline so prominent on the homepage?"
→ Requirement FR-003: Tagline must be visible
→ Constitutional Principle: Brand Authenticity
→ Claim #7: "Our tagline is our differentiator"
→ Original input: "People remember our slogan 'We Knead You'"

Can't get that from a traditional questionnaire.
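One way to make that chain queryable is to give every record a pointer to its parent. The record IDs and the `parent` field below are hypothetical; only the texts come from the example above.

```python
from typing import Optional

# Hypothetical provenance store: each record points at what it was derived from.
RECORDS: dict[str, dict[str, Optional[str]]] = {
    "FR-003": {"text": "Tagline must be visible", "parent": "principle-brand"},
    "principle-brand": {"text": "Brand Authenticity", "parent": "claim-7"},
    "claim-7": {"text": "Our tagline is our differentiator", "parent": "evidence-7"},
    "evidence-7": {"text": "People remember our slogan 'We Knead You'", "parent": None},
}


def trace(record_id: str, records: dict[str, dict[str, Optional[str]]]) -> list[str]:
    """Walk parent links from a requirement back to the original client input."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = record_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        record = records[current]
        chain.append(record["text"])
        current = record["parent"]
    return chain


for step in trace("FR-003", RECORDS):
    print("→", step)
```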

2. Auditability

We can diff claim sets between submissions. If a client comes back 6 months later wanting changes, we can see exactly what changed:

- Claim #23: Target audience is individual consumers
+ Claim #23: Target audience is wholesale restaurant buyers

And regenerate specs accordingly.
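If claims are stored by ID, a diff like the one above takes only a few lines. This sketch assumes each claim set is a plain mapping from claim number to text; `diff_claims` is an illustrative name.

```python
def diff_claims(old: dict[int, str], new: dict[int, str]) -> list[str]:
    """Return diff-style lines for claims that were added, removed, or changed."""
    lines: list[str] = []
    for claim_id in sorted(old.keys() | new.keys()):
        before, after = old.get(claim_id), new.get(claim_id)
        if before == after:
            continue
        if before is not None:
            lines.append(f"- Claim #{claim_id}: {before}")
        if after is not None:
            lines.append(f"+ Claim #{claim_id}: {after}")
    return lines


old = {5: "Same-day service", 23: "Target audience is individual consumers"}
new = {5: "Same-day service", 23: "Target audience is wholesale restaurant buyers"}
print("\n".join(diff_claims(old, new)))
```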

3. No Data Loss

Traditional intake: "Please describe your services"
Client writes 400 words.
Developer summarizes to 3 bullet points.
Information lost forever.

Our system: Every statement becomes a claim. Every claim is preserved alongside the original text. Nothing gets summarized away.

4. Conflict Detection That Scales

With 5-10 requirements, humans can spot conflicts. With 60-70 atomic claims across brand, audience, features, content, design, and constraints?

You need AI to find:

The subtle contradiction between "budget-friendly" (Claim #8) and "premium materials" (Claim #34)
The ambiguity in "fast turnaround" (Claim #19) with no definition of "fast"
The tension between "target busy professionals" (Claim #5) and 9-5 business hours (Claim #28), when those customers are least available

Real-World Impact: The Artist Portfolio Example

Let's say a ceramic artist fills out the intake:

Initial input (conversational):

"I make functional pottery—bowls, mugs, vases. Each piece is unique. I want people to see my work and contact me for commissions. I'm inspired by Japanese minimalism but with warmer colors. My customers are interior designers and people furnishing new homes."

Our system extracts:

8 claims about offerings (functional pottery, custom commissions)
4 claims about audience (interior designers, people furnishing new homes, timing of their purchases)
6 claims about aesthetic (Japanese minimalism, warm colors, unique pieces)
3 claims about goals (showcase work, generate commissions)

Conflict Detector finds:

✅ No conflicts
⚠️ Ambiguity: "Japanese minimalism + warm colors" (traditional Japanese minimalism uses cool, muted tones)
⚠️ Gap: No pricing strategy mentioned (commission inquiries need pricing context)

Clarification sent:

Q1: Japanese minimalism typically uses cool, muted tones. You mentioned "warmer colors"—which direction should we prioritize?

A) Traditional Japanese aesthetic (cool grays, muted earth tones)
B) Warm minimalism (ochre, terracotta, warm beige)
C) Mix both (Japanese forms, warm color accents)

Client picks B. System generates:

Constitutional Principle:

### Warm Minimalism
- Color palette must use warm earth tones (ochre, terracotta, warm beige)
- Design must follow minimalist principles (negative space, clean lines)
- Avoid traditional cool Japanese palette
- Each piece must be showcased individually (no grid overwhelm)

Rationale: Differentiation from traditional ceramics galleries

Specification:

### User Story 1 - View Artwork Gallery (Priority: P1)

Potential clients can browse ceramic pieces with high-quality imagery and understand each piece's uniqueness.

**Acceptance Scenarios**:
1. **Given** visitor lands on portfolio, **When** they browse, **Then** they see large images with piece details
2. **Given** visitor clicks a piece, **When** detail view opens, **Then** they see dimensions, materials, availability

**Success Criteria**:
- Visitors can view any piece in under 10 seconds
- 100% of pieces have high-quality photos (min 1200px width)
- Mobile visitors can swipe through gallery smoothly

All from a five-sentence description and one clarifying question.

The Techniques Behind It

1. Dual-Track Processing

We don't send everything to AI. We split the data:

Data Type	Processing	Example
Structured	Code (instant)	Email addresses, phone numbers, page selections
Unstructured	AI (thoughtful)	Brand personality descriptions, service explanations

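Why? Assume AI costs about $0.002 per 1K tokens and each request, prompt included, uses roughly 1K tokens. Processing 1,000 email addresses through AI then costs about $2. Processing them through regex costs effectively $0.00. When you're handling hundreds of intake forms, this matters.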
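The $2 figure works out if each address goes out as its own request of about 1K tokens. That per-request size is an assumption made here, as are the function and parameter names; actual pricing varies by model and provider.

```python
PRICE_PER_1K_TOKENS = 0.002  # dollars, the rate quoted in this article


def ai_cost(requests: int, tokens_per_request: int,
            price_per_1k: float = PRICE_PER_1K_TOKENS) -> float:
    """Estimated dollar cost of sending `requests` calls through the model."""
    total_tokens = requests * tokens_per_request
    return total_tokens / 1000 * price_per_1k


# 1,000 addresses, one request each, about 1K tokens per request (assumed).
estimate = ai_cost(requests=1000, tokens_per_request=1000)
print(f"${estimate:.2f}")
```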

2. Claim Templates

When we ask clarifying questions, we don't just collect answers—we embed claim templates that specify exactly how the answer becomes a requirement.

Question: "What level of booking functionality?"
Template:

{
  "text_pattern": "The booking system should use {answer}",
  "category": "feature",
  "answer_mapping": {
    "OPTION_A": "a simple request form where customers suggest dates and staff confirms manually",
    "OPTION_B": "a calendar interface where customers select from available time slots"
  }
}

When client clicks Option B, the system doesn't just store "B"—it generates:

"The booking system should use a calendar interface where customers select from available time slots"

Perfect semantic claim. Zero parsing ambiguity.
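Hydrating a template like this needs no AI at all. Here is a minimal Python sketch using the template above; the `hydrate` function and the shape of its return value are assumptions.

```python
TEMPLATE = {
    "text_pattern": "The booking system should use {answer}",
    "category": "feature",
    "answer_mapping": {
        "OPTION_A": "a simple request form where customers suggest dates and staff confirms manually",
        "OPTION_B": "a calendar interface where customers select from available time slots",
    },
}


def hydrate(template: dict, option: str) -> dict:
    """Turn a button value such as OPTION_B into a full claim."""
    try:
        answer = template["answer_mapping"][option]
    except KeyError:
        raise ValueError(f"unknown option: {option}") from None
    return {
        "text": template["text_pattern"].format(answer=answer),
        "category": template["category"],
    }


claim = hydrate(TEMPLATE, "OPTION_B")
print(claim["text"])
```

An unknown button value raises an error instead of producing a half-formed claim.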

3. Constitutional Thinking

We borrowed this from constitutional AI research: Have a set of immutable principles that guide all decisions.

For websites, this means:

Brand principles (what's sacred about the brand?)
UX principles (who are we optimizing for?)
Design principles (what aesthetic is non-negotiable?)
Technical constraints (what must work, by when?)

Plus universal standards (accessibility, performance, security, SEO, mobile).

Every AI agent in the pipeline checks its output against the constitution. If a suggestion violates a principle, it's rejected or flagged for review.

4. Progressive Refinement

The clarification loop is iterative but bounded:

Round 1: Ask critical questions (2-3 max)
Auto-apply fallbacks to medium-priority ambiguities (with approval)
Round 2 (if needed): Only if critical issues remain
Max 3 rounds: After that, proceed with best-effort defaults

Each round makes claims more specific. But we never ask the same thing twice, and we always show our reasoning when applying defaults.
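The bounded loop can be sketched like this. It assumes each question is asked at most once and that anything unanswered, or still queued after the last round, falls back to a default. `run_clarification` and the `ask_client` callback are hypothetical names.

```python
from typing import Callable


def run_clarification(
    questions: list[str],
    ask_client: Callable[[list[str]], dict[str, str]],
    max_rounds: int = 3,
    per_round: int = 3,
) -> tuple[dict[str, str], list[str], int]:
    """Ask questions in small batches for a bounded number of rounds."""
    queue = list(questions)
    answers: dict[str, str] = {}
    defaulted: list[str] = []
    rounds = 0
    while queue and rounds < max_rounds:
        batch, queue = queue[:per_round], queue[per_round:]
        rounds += 1
        replies = ask_client(batch)  # may omit questions the client skipped
        for question in batch:
            if question in replies:
                answers[question] = replies[question]
            else:
                defaulted.append(question)  # never re-asked
    defaulted.extend(queue)  # out of rounds: best-effort defaults
    return answers, defaulted, rounds


def answer_everything(batch: list[str]) -> dict[str, str]:
    """Stand-in for the client: answers every question it is shown."""
    return {question: "OPTION_A" for question in batch}


questions = [f"Q{number}" for number in range(1, 11)]
answers, defaulted, rounds = run_clarification(questions, answer_everything)
```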

The Output: A Repository Ready to Build

Here's what comes out the other end:

`constitution.md`

# Sunrise Bakery - Website Constitution

## Brand Principles

### Artisanal Authenticity  
- Family recipes and scratch-made process must be highlighted
- Cozy, welcoming personality is non-negotiable
- Must differentiate from chain bakery sterility

### Personal Connection
- Customer recognition and personalization are core values
- Website must feel warm and human, not corporate

`spec.md`

# Website Specification: Sunrise Bakery

## User Scenarios & Goals

### User Story 1 - View Daily Menu (Priority: P1)

Customers can see what's available today and place orders.

**Acceptance Scenarios**:
1. **Given** customer visits site, **When** they view menu, **Then** they see today's offerings with photos and prices

`plan.md`

# Implementation Plan: Sunrise Bakery

## Page-by-Page Breakdown

### Home Page

**Layout**:
- Hero: Warm photo of fresh bread, headline, daily special
- Today's Menu: 3-column grid of current offerings
- Story section: Family history, scratch-made promise
- Location/Hours: Map, contact info, order CTA

**Components**:
- HeroWithImage: Bakery photo, "Fresh Daily Since 1995" headline
- MenuGrid: Dynamic daily offerings (can update via CMS)
- StoryBlock: Family photo, 2-paragraph origin story
- ContactCTA: "Order Now" button + phone display

Plus:

Design system (colors: warm browns #8B4513, cream #F5E6D3)
Build phases (MVP = menu + contact, Phase 2 = online ordering)
Quality gates (performance, accessibility benchmarks)

All generated from the intake conversation.
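As one example of the generated output, the design system colors can be rendered into CSS custom properties with a few lines of Python. The token names and the `tokens_to_css` helper are assumptions; only the two hex values come from the plan above.

```python
def tokens_to_css(tokens: dict[str, str]) -> str:
    """Render design tokens as CSS custom properties on :root."""
    lines = [":root {"]
    for name, value in sorted(tokens.items()):
        lines.append(f"  --{name}: {value};")
    lines.append("}")
    return "\n".join(lines)


tokens = {
    "color-warm-brown": "#8B4513",  # hypothetical token names
    "color-cream": "#F5E6D3",
}
css = tokens_to_css(tokens)
print(css)
```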

GitHub + Vercel: Deployment on Rails

The final step: We programmatically create a GitHub repository with:

All three markdown files (constitution, spec, plan)
Next.js template with shadcn/ui components
Design tokens auto-generated from constitutional principles
Vercel configuration (auto-deploy staging site on push)
GitHub Actions for quality checks (Lighthouse CI, accessibility audit)
Kilocode workflows that can implement the spec
Production website build and deploy pipeline

Trigger implementation.

AI reads the spec, honors the constitution, builds the site, commits the code, and deploys to a preview URL.

From chat to deployed website with minimal human intervention.

Why This Approach Wins

For Clients:

✅ Natural conversation, not checkbox hell
✅ Fast clarifications (approve our decisions or override)
✅ No lost context ("wait, what did we say about colors?")
✅ Transparent process (see exactly what we understood)

For Developers:

✅ Clean, validated requirements (no ambiguous descriptions)
✅ Traceable decisions (every requirement has provenance)
✅ Constitutional guardrails (AI can't ignore key constraints)
✅ Ready-to-implement specs (no interpretation phase)

For the Business:

✅ Scalable (handle 10x more projects with same team)
✅ Quality (AI catches conflicts humans miss)
✅ Consistency (constitutional principles ensure brand alignment)
✅ Speed (intake to preview in days, not weeks)

The Future: Specification as Infrastructure

We're moving toward a world where requirements are code. Not metaphorically—literally.

Your intake conversation generates:

constitution.md (immutable principles)
spec.md (versioned requirements)
plan.md (component contracts)

These live in Git. They have version history. They can be diffed, merged, and branched. When requirements change, you update the spec, and the system regenerates only what changed.

This is spec-driven development for websites. And it's possible because we stopped treating client input as "the answer" and started treating it as evidence in a requirements compiler.

We're Still Iterating

This system represents months of exploration:

Failed attempt #1: "Just use GPT to parse the form" (hallucinated requirements)
Failed attempt #2: "Chain multiple prompts" (lost context between calls)
Failed attempt #3: "Let AI resolve conflicts" (made assumptions without asking)

What works: AI for interpretation, code for validation, templates for structure, and a clear separation between evidence, intent, decisions, and artifacts.

If you're building anything that takes human input and generates structured output—whether it's websites, legal contracts, or technical specs—the core insight applies:

Don't trust the AI. Don't trust the human. Build a system that validates both.

Want to see this in action? Try our intake chatbot and watch your requirements transform from conversation to constitution in no time.
