AI Order Intake: Build Plan

How it gets built with AI, what the real hours look like, and what only a human can do.

AI Agent Hours: ~15h
Human Hours: ~15h
Total: ~30h
Target: Sprint 3
How It Gets Built
Technical Architecture
Frontend: Intake Panel (Cloudflare Pages)
→ Backend: Parse Endpoint (Edge Function)
→ AI: LLM (Claude / GPT)
↓
Context: Product Catalog + Client History
Pipeline: Order Created → Packing Sheet
→ Finance: QB Invoice Auto-Created
The only new layer is the parse endpoint + LLM call. Everything below (order pipeline, packing sheet, QB) already exists in the portal; AI Intake just feeds into it.
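As a rough illustration of that new layer, the sketch below assembles the text the parse endpoint might send to the LLM from the product catalog and the client's order history. The type names, field names, SKU format, and prompt wording are all assumptions made for this example; the plan does not specify them, and the actual LLM call is left out.

```typescript
// Hypothetical shapes: the portal's real catalog and history types are not shown in this plan.
interface CatalogItem { sku: string; name: string; unit: string; }
interface PastOrderLine { sku: string; quantity: number; }
interface ParseRequest { clientName: string; message: string; }

// Build the prompt text: instructions, then the catalog, then the client's
// recent orders, then the raw email/SMS message.
function buildParsePrompt(
  req: ParseRequest,
  catalog: CatalogItem[],
  history: PastOrderLine[][],
): string {
  const catalogLines = catalog.map((c) => `${c.sku} | ${c.name} | ${c.unit}`);
  const historyLines = history.map(
    (order, i) =>
      `Order ${i + 1}: ` + order.map((l) => `${l.quantity} x ${l.sku}`).join(", "),
  );
  return [
    "Extract order lines from the message below.",
    'Use only SKUs from the catalog. Reply with JSON: [{"sku": string, "quantity": number}].',
    'If an item is ambiguous, add "flag": true instead of guessing.',
    "",
    "CATALOG (sku | name | unit):",
    ...catalogLines,
    "",
    `RECENT ORDERS FOR ${req.clientName}:`,
    ...(historyLines.length > 0 ? historyLines : ["(none)"]),
    "",
    "MESSAGE:",
    req.message,
  ].join("\n");
}

// Example with made-up data.
const prompt = buildParsePrompt(
  { clientName: "Example Client", message: "the usual herbs please" },
  [{ sku: "HERB-BASIL", name: "Basil", unit: "lb" }],
  [[{ sku: "HERB-BASIL", quantity: 2 }]],
);
console.log(prompt);
```

Keeping prompt assembly in a pure function like this would let it be tested without calling the model, which matters because the wording is expected to change during calibration.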
Hour Breakdown: What AI Does vs. What Humans Do
Frontend UI (intake panel + draft view): 5h AI + 2h human = 7h total
Backend parse endpoint + prompt: 4h AI + 2h human = 6h total
Prompt calibration ⚠️ human-only: 5–8h human = 5–8h total
Pipeline integration → packing + QB: 2h AI + 1h human = 3h total
Testing with real emails ⚠️: 2h AI + 3h human = 5h total
Edge cases + refinement: 2h AI + 1.5h human = 3.5h total
Hours Summary
Component | AI Agent | Human | Total | Notes
Frontend UI (intake panel, draft view, inline edit, confirm bar) | 5h | 2h | 7h | Draft view is the most complex: badges, flags, disabled state logic
Backend parse endpoint + prompt construction | 4h | 2h | 6h | Injecting catalog + history into prompt correctly takes iteration
Prompt calibration | - | 5–8h | 5–8h | ⚠️ The real work. Requires Brendon's actual emails to calibrate against.
Pipeline integration (intake → createOrder → packing → QB) | 2h | 1h | 3h | Reuses existing pipeline, minimal new code
Testing with real emails from Brendon's clients | 2h | 3h | 5h | ⚠️ Cannot skip. Synthetic examples don't catch real patterns.
Edge cases + refinement | 2h | 1.5h | 3.5h | -
TOTAL | ~15h | ~15h | ~30h | ~2.5 weeks (Sprint 3) · includes 30% buffer throughout
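The rounded totals can be checked by summing the rows. The sketch below copies the figures from the summary and keeps prompt calibration as a 5–8h range; the `Row` shape and function name are just for illustration.

```typescript
// Rows copied from the Hours Summary. Human hours are [min, max] so the
// 5–8h prompt calibration range is preserved.
interface Row { component: string; ai: number; human: [number, number]; }

const rows: Row[] = [
  { component: "Frontend UI", ai: 5, human: [2, 2] },
  { component: "Backend parse endpoint + prompt", ai: 4, human: [2, 2] },
  { component: "Prompt calibration", ai: 0, human: [5, 8] },
  { component: "Pipeline integration", ai: 2, human: [1, 1] },
  { component: "Testing with real emails", ai: 2, human: [3, 3] },
  { component: "Edge cases + refinement", ai: 2, human: [1.5, 1.5] },
];

function sumHours(rs: Row[]) {
  const ai = rs.reduce((t, r) => t + r.ai, 0);
  const humanMin = rs.reduce((t, r) => t + r.human[0], 0);
  const humanMax = rs.reduce((t, r) => t + r.human[1], 0);
  return { ai, humanMin, humanMax, totalMin: ai + humanMin, totalMax: ai + humanMax };
}

const totals = sumHours(rows);
console.log(totals);
// { ai: 15, humanMin: 14.5, humanMax: 17.5, totalMin: 29.5, totalMax: 32.5 }
```

The exact sums are 15h AI, 14.5–17.5h human, and 29.5–32.5h overall, which the plan rounds to ~15h, ~15h, and ~30h.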
What Only Humans Can Do
⚠️ Prompt Engineering: The Real Work
The LLM needs to reliably turn "the usual herbs" into the right product at the right quantity, using the client's order history as context. Getting this right isn't code, it's calibration. It requires writing the system prompt, injecting the product catalog correctly, testing against sample inputs, measuring accuracy, and iterating. This is human work: 4–6 hours minimum (budgeted as 5–8h in the hours summary, which includes the 30% buffer), and it directly determines whether the feature saves time or creates more work.
⚠️ Testing with Real Emails: Non-Negotiable
The parser must be tuned against Brendon's actual client emails, not synthetic examples. Clients like MW Restaurant, Safeway, and Halekulani Hotel each have their own ordering language and habits. Without testing on those real patterns, the first production run will have lower accuracy, and staff will spend more time correcting than they save. Brendon needs to provide 20–30 real order emails/SMS before Sprint 3 begins.
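One way to make "measuring accuracy" concrete during calibration is to score the parser against a labeled set of messages. The sketch below is an assumption about how that could look, not the plan's method: the types, the exact-match metric, and the stub parser are hypothetical, and a real parser would call the LLM asynchronously.

```typescript
interface OrderLine { sku: string; quantity: number; }
interface Sample { input: string; expected: OrderLine[]; }
type Parser = (input: string) => OrderLine[];

// Canonical string for a set of lines, so line order does not affect comparison.
function key(lines: OrderLine[]): string {
  return lines.map((l) => `${l.sku}:${l.quantity}`).sort().join(",");
}

// Fraction of samples where the parsed order matches the expected order exactly.
function exactMatchAccuracy(samples: Sample[], parse: Parser): number {
  if (samples.length === 0) return 0;
  const hits = samples.filter((s) => key(parse(s.input)) === key(s.expected)).length;
  return hits / samples.length;
}

// Stub standing in for the LLM-backed parser (illustration only).
const stubParser: Parser = (input) =>
  input.includes("usual") ? [{ sku: "HERB-BASIL", quantity: 2 }] : [];

// Made-up samples; in practice these would come from the real emails/SMS.
const samples: Sample[] = [
  { input: "the usual herbs", expected: [{ sku: "HERB-BASIL", quantity: 2 }] },
  { input: "3 cases romaine", expected: [{ sku: "LETT-ROMAINE", quantity: 3 }] },
];

const accuracy = exactMatchAccuracy(samples, stubParser);
console.log(accuracy); // 0.5
```

Re-running a score like this after each prompt change shows whether an edit helped or only moved the errors elsewhere.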
Human review gate stays in v1
No auto-confirm in v1. Every parsed order requires human eyes before entering the pipeline. This adds ~60 seconds per order but eliminates wrong-fill risk. As accuracy data accumulates, a configurable auto-confirm threshold can be introduced in Phase 2, but only after real-world performance is proven.
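A minimal sketch of how such a gate could be expressed, assuming the Phase 2 threshold is compared against a per-draft confidence score. The plan does not say what the threshold measures, so `confidence`, `autoConfirmThreshold`, and the 0.95 value are hypothetical.

```typescript
// Hypothetical draft shape; confidence is assumed to be a 0..1 score.
interface Draft { orderId: string; confidence: number; }

// null means auto-confirm is switched off and every draft is reviewed.
interface GateConfig { autoConfirmThreshold: number | null; }

function requiresHumanReview(draft: Draft, config: GateConfig): boolean {
  if (config.autoConfirmThreshold === null) return true; // v1 behavior
  return draft.confidence < config.autoConfirmThreshold;
}

const V1_CONFIG: GateConfig = { autoConfirmThreshold: null };
const PHASE2_EXAMPLE: GateConfig = { autoConfirmThreshold: 0.95 }; // illustrative value

const draft: Draft = { orderId: "draft-1", confidence: 0.99 };
console.log(requiresHumanReview(draft, V1_CONFIG)); // true
console.log(requiresHumanReview(draft, PHASE2_EXAMPLE)); // false
```

Treating "always review" as a configuration value rather than hard-coding it would let Phase 2 enable auto-confirm without touching the pipeline code.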
Where AI Intake Fits in the Timeline
Pre-Sprint: Data Ingestion · Now · Unlocks UAT
Ingest Brendon's wholesale price list + bulk import QB customer export
Sprint 1: Foundation (Auth + Catalog + Onboarding) · Wk 1–2 · ~12 days
Role-based access, product CRUD, customer invite flow, bulk import
Sprint 2: Ordering + Packing + Standing Orders · Wk 3–4 · ~12 days
Full order pipeline, packing sheets, UAT with real clients
Sprint 3: 🤖 AI Order Intake (This Build) · Wk 5–6 · ~18h total
Intake UI + parse endpoint + prompt engineering + testing with real emails. Requires sample emails from Brendon by start of Sprint 3.
Sprint 4: QuickBooks Integration · Wk 7–8 · ~10h
OAuth connect, invoice auto-creation on order confirm, customer sync
The One Thing That Can Delay This
Critical dependency: Brendon action required
20–30 real order emails/SMS from Brendon's clients, needed before Sprint 3 starts.

These don't need to be curated or formatted; raw forwarded emails work. The goal is to test the parser against the real language Brendon's clients actually use. Without these, Sprint 3 delivery shifts by ~1 week. With them, the parser ships production-ready.
What Makes This Different from a Traditional Build
Traditional Agency | This Approach (AI-Accelerated)
Dev writes parsing logic manually with regex/rules | LLM handles natural language, no rule writing
New client format = new rule = new sprint | New client format = handled automatically
Estimated 3–4 weeks for this feature alone | ~2 weeks including calibration
~$15–25K agency cost for this scope | ~11h human hours + AI inference cost
Hard to update parsing logic post-launch | Prompt update = immediate behavior change