Mari's Gardens — AI Intake

How It Gets Built

Technical Architecture

Frontend: Intake Panel (Cloudflare Pages) → Backend: Parse Endpoint (Edge Function) → AI: LLM (Claude / GPT)

↓

Context: Product Catalog + Client History

Pipeline: Order Created → Packing Sheet → Finance: QB Invoice Auto-Created

The only new layer is the parse endpoint plus the LLM call. Everything downstream of it (order pipeline, packing sheet, QB) already exists in the portal; AI Intake just feeds into it.
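As a rough illustration of that new layer, the sketch below shows one way the parse endpoint could assemble a prompt from the catalog and client history, call the model, and turn the reply into a draft for review. Every name here (`Product`, `OrderLine`, `DraftLine`, `buildPrompt`, `toDraft`, `parseOrder`, and the injected `callLlm` function) is hypothetical; the portal's real schema, prompt wording, and LLM client are not specified in this plan.

```typescript
// Hypothetical shapes; the portal's real schema is not shown in this plan.
interface Product { sku: string; name: string; unit: string }
interface OrderLine { sku: string; quantity: number }
interface DraftLine extends OrderLine { flagged: boolean }

// The LLM client is injected so the sketch stays provider-agnostic (Claude or GPT).
type CallLlm = (prompt: string) => Promise<string>;

// Inject the catalog and the client's recent orders into the prompt as plain text.
function buildPrompt(message: string, catalog: Product[], history: OrderLine[]): string {
  const catalogText = catalog.map((p) => `${p.sku} | ${p.name} | ${p.unit}`).join("\n");
  const historyText = history.map((h) => `${h.sku} x ${h.quantity}`).join("\n");
  return [
    "Extract the order lines from the client message as a JSON array of {sku, quantity}.",
    "Use only SKUs that appear in the catalog.",
    `CATALOG:\n${catalogText}`,
    `RECENT ORDERS FOR THIS CLIENT:\n${historyText}`,
    `MESSAGE:\n${message}`,
  ].join("\n\n");
}

// Turn the model's raw JSON reply into draft lines, flagging anything suspicious.
function toDraft(raw: string, catalog: Product[]): DraftLine[] {
  const parsed = JSON.parse(raw) as OrderLine[];
  const known = new Set(catalog.map((p) => p.sku));
  return parsed.map((line) => ({
    sku: line.sku,
    quantity: line.quantity,
    // Unknown SKUs and non-positive or non-numeric quantities need a human look.
    flagged: !known.has(line.sku) || !(line.quantity > 0),
  }));
}

// End-to-end parse: prompt construction, one model call, then draft building.
async function parseOrder(
  message: string,
  catalog: Product[],
  history: OrderLine[],
  callLlm: CallLlm,
): Promise<DraftLine[]> {
  const raw = await callLlm(buildPrompt(message, catalog, history));
  return toDraft(raw, catalog);
}
```

In this sketch a malformed model reply makes `JSON.parse` throw; a real endpoint would presumably catch that and return the message to the intake panel for manual entry.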

Hour Breakdown — What AI Does vs. What Humans Do

Legend: AI Agent hours vs. Human hours

Frontend UI (intake panel + draft view): 5h AI + 2h human = 7h total
Backend parse endpoint + prompt: 4h AI + 2h human = 6h total
Prompt calibration ⚠️ human-only: 5–8h human = 5–8h total
Pipeline integration → packing + QB: 2h AI + 1h human = 3h total
Testing with real emails ⚠️: 2h AI + 3h human = 5h total
Edge cases + refinement: 2h AI + 1.5h human = 3.5h total

Hours Summary

Component	AI Agent	Human	Total	Notes
Frontend UI (intake panel, draft view, inline edit, confirm bar)	5h	2h	7h	Draft view is the most complex — badges, flags, disabled state logic
Backend parse endpoint + prompt construction	4h	2h	6h	Injecting catalog + history into prompt correctly takes iteration
Prompt calibration	—	5–8h	5–8h	⚠️ The real work. Requires Brendon's actual emails to calibrate against.
Pipeline integration (intake → createOrder → packing → QB)	2h	1h	3h	Reuses existing pipeline — minimal new code
Testing with real emails from Brendon's clients	2h	3h	5h	⚠️ Cannot skip. Synthetic examples don't catch real patterns.
Edge cases + refinement	2h	1.5h	3.5h	—
TOTAL	~15h	~15h	~30h	~2.5 weeks (Sprint 3) · includes 30% buffer throughout
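
The totals row can be checked by summing the rows above it. The short sketch below does that arithmetic with each estimate held as a [low, high] range; the type and variable names are illustrative only. The sums come to 15h of AI time, 14.5–17.5h of human time, and 29.5–32.5h overall, so the rounded ~15h / ~15h / ~30h figures sit at the low end of the prompt calibration range.

```typescript
// [low, high] hour ranges copied from the Hours Summary table.
type HourRange = [number, number];
interface HoursRow { component: string; ai: HourRange; human: HourRange }

const hoursRows: HoursRow[] = [
  { component: "Frontend UI", ai: [5, 5], human: [2, 2] },
  { component: "Backend parse endpoint + prompt construction", ai: [4, 4], human: [2, 2] },
  { component: "Prompt calibration", ai: [0, 0], human: [5, 8] },
  { component: "Pipeline integration", ai: [2, 2], human: [1, 1] },
  { component: "Testing with real emails", ai: [2, 2], human: [3, 3] },
  { component: "Edge cases + refinement", ai: [2, 2], human: [1.5, 1.5] },
];

// Add up the low ends and the high ends separately.
function sumRanges(ranges: HourRange[]): HourRange {
  return ranges.reduce<HourRange>(([lo, hi], [l, h]) => [lo + l, hi + h], [0, 0]);
}

const aiTotal = sumRanges(hoursRows.map((r) => r.ai));
const humanTotal = sumRanges(hoursRows.map((r) => r.human));
const grandTotal: HourRange = [aiTotal[0] + humanTotal[0], aiTotal[1] + humanTotal[1]];

console.log({ aiTotal, humanTotal, grandTotal });
```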

What Only Humans Can Do

⚠️ Prompt Engineering — The Real Work

The LLM needs to reliably turn "the usual herbs" into the right product at the right quantity, using the client's order history as context. Getting this right isn't code, it's calibration: writing the system prompt, injecting the product catalog correctly, testing against sample inputs, measuring accuracy, and iterating. This is human work, estimated at 5–8 hours in the hours summary, and it directly determines whether the feature saves time or creates more work.
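
The "measuring accuracy" step could be as simple as replaying labeled sample messages through the parser and counting exact matches. The sketch below assumes a hypothetical `LabeledSample` shape and a synchronous `parse` function to keep the example short (the real parser would be an async LLM call); exact-match scoring per order is also just one possible metric, not one this plan specifies.

```typescript
// Hypothetical shapes for a calibration set built from real client messages.
interface ParsedLine { sku: string; quantity: number }
interface LabeledSample { message: string; expected: ParsedLine[] }

// Two orders match when they contain the same SKU/quantity pairs, in any order.
function sameOrder(a: ParsedLine[], b: ParsedLine[]): boolean {
  const key = (lines: ParsedLine[]) =>
    lines.map((l) => `${l.sku}:${l.quantity}`).sort().join(",");
  return key(a) === key(b);
}

// Fraction of samples the parser gets exactly right (0 for an empty set).
function accuracy(
  samples: LabeledSample[],
  parse: (message: string) => ParsedLine[],
): number {
  if (samples.length === 0) return 0;
  const correct = samples.filter((s) => sameOrder(parse(s.message), s.expected)).length;
  return correct / samples.length;
}
```

Re-running a check like this after each prompt change shows whether an edit improved the parser or only moved errors from one client's phrasing to another's.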

⚠️ Testing with Real Emails — Non-Negotiable

The parser must be tuned against Brendon's actual client emails — not synthetic examples. Clients like MW Restaurant, Safeway, and Halekulani Hotel each have their own ordering language and habits. Without testing on those real patterns, the first production run will have lower accuracy, and staff will spend more time correcting than they save. Brendon needs to provide 20–30 real order emails/SMS before Sprint 3 begins.

Human review gate stays in v1

No auto-confirm in v1. Every parsed order requires human eyes before entering the pipeline. This adds ~60 seconds per order but keeps a misparsed order from reaching packing unchecked. As accuracy data accumulates, a configurable auto-confirm threshold can be introduced in Phase 2, but only after real-world performance is proven.
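
One way to keep the v1 gate and the possible Phase 2 threshold in a single code path is sketched below. The `confidence` score, the `flaggedLines` count, and the `autoConfirmThreshold` setting are all assumptions made for illustration; the plan does not define how confidence would be measured.

```typescript
// Hypothetical fields; how confidence is computed is not defined in this plan.
interface ParsedOrder { confidence: number; flaggedLines: number }

// null means "no auto-confirm", which is the v1 behavior.
interface GateConfig { autoConfirmThreshold: number | null }

function requiresHumanReview(order: ParsedOrder, config: GateConfig): boolean {
  // v1: every parsed order goes to a human, regardless of confidence.
  if (config.autoConfirmThreshold === null) return true;
  // Phase 2 (if ever enabled): any flagged line still forces a review.
  if (order.flaggedLines > 0) return true;
  return order.confidence < config.autoConfirmThreshold;
}

const v1Config: GateConfig = { autoConfirmThreshold: null };
console.log(requiresHumanReview({ confidence: 0.99, flaggedLines: 0 }, v1Config));
```

With the threshold left as `null`, the gate cannot be bypassed by a high score, so turning on auto-confirm later is a configuration change rather than a rewrite.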

Where AI Intake Fits in the Timeline

Pre-Sprint · Data Ingestion · Now · Unlocks UAT
Ingest Brendon's wholesale price list + bulk import QB customer export

Sprint 1 · Foundation (Auth + Catalog + Onboarding) · Wk 1–2 · ~12 days
Role-based access, product CRUD, customer invite flow, bulk import

Sprint 2 · Ordering + Packing + Standing Orders · Wk 3–4 · ~12 days
Full order pipeline, packing sheets, UAT with real clients

Sprint 3 · 🤖 AI Order Intake (This Build) · Wk 5–6 · ~18h total
Intake UI + parse endpoint + prompt engineering + testing with real emails. Requires sample emails from Brendon by start of Sprint 3.

Sprint 4 · QuickBooks Integration · Wk 7–8 · ~10h
OAuth connect, invoice auto-creation on order confirm, customer sync

The One Thing That Can Delay This

Critical dependency — Brendon action required

20–30 real order emails/SMS from Brendon's clients, needed before Sprint 3 starts.

These don't need to be curated or formatted — raw forwarded emails work. The goal is to test the parser against the real language Brendon's clients actually use. Without these, Sprint 3 delivery shifts by ~1 week. With them, the parser ships production-ready.

What Makes This Different from a Traditional Build

Traditional Agency	This Approach (AI-Accelerated)
Dev writes parsing logic manually with regex/rules	LLM handles natural language — no rule writing
New client format = new rule = new sprint	New client format = handled automatically
Estimated 3–4 weeks for this feature alone	~2 weeks including calibration
~$15–25K agency cost for this scope	~11h human hours + AI inference cost
Hard to update parsing logic post-launch	Prompt update = immediate behavior change

AI Order Intake — Build Plan