HungerSync build — Scoping & designing the discovery-assistant PoC

AIP-C01 / AIP-1.1 — applying the six-phase scoping methodology to HungerSync's discovery-and-ordering assistant. Feeds CS1.

The concept (provider-neutral)

Before committing to a generative-AI build, scope it as a time-boxed proof of concept: confirm the problem is a good fit for an FM, inventory the data, define go/no-go success metrics, address responsible-AI risks, time-box the work, and align stakeholders. Then select a model by requirements (modalities, accuracy, latency, cost) via comparative evaluation on your data, and justify it with a cost/value case.

The HungerSync problem (world-neutral)

Before HungerSync commits to the Peachtree pilot, it must prove that a conversational discovery-and-ordering assistant can get a stranded passenger fed faster and cheaper than today’s options (walk the concourse, or redeem a paper airline voucher at a counter) — without ever recommending an item that’s unavailable or unsafe for a stated allergy. This PoC validates feasibility before the contract clock starts.

Applying the six-phase scoping to HungerSync

1 — Problem fit. Conversational customer interaction + process automation: a strong FM fit. The guardrail: allergen safety and “is it actually available right now” are accuracy-critical, so the PoC keeps a human-checkable safety layer rather than letting the model make high-stakes decisions fully autonomously.

2 — Data. PoC corpus: vendor menus (messy free-text), a gate/terminal walk-time map, a dietary/allergen reference, and a sample of historical orders by route/time. No model training needed for the PoC; this is retrieval + prompting over current facts. (Menu-text cleanup is where light fine-tuning may later live — see ML2.)

3 — Success metrics (go/no-go). Time-to-resolution vs. the status-quo baseline; availability accuracy (recommended items are actually orderable); allergen-safety pass rate (target: zero unsafe recommendations on the test set); passenger CSAT; cost per resolved order. Go only if the test set shows zero unsafe recommendations and both time-to-resolution and cost per resolved order beat the status-quo baseline.
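The go/no-go rule can be written down as a small check so it is applied the same way at every weekly checkpoint. This is a minimal sketch, not the PoC's actual harness: the class names, field names, and the use of median minutes are assumptions for illustration. Availability accuracy and CSAT are left out of the gate because the notes give no threshold for them.

```python
from dataclasses import dataclass


@dataclass
class PocResults:
    # Hypothetical field names; values come from the evaluation test set.
    unsafe_recommendations: int
    median_minutes_to_resolution: float
    cost_per_resolved_order: float


@dataclass
class Baseline:
    # Status-quo measurements (walk the concourse / paper voucher).
    median_minutes_to_resolution: float
    cost_per_resolved_order: float


def go_no_go(results: PocResults, baseline: Baseline) -> tuple[bool, list[str]]:
    """Return (go, reasons); go is True only when no blocking reason is found."""
    reasons = []
    if results.unsafe_recommendations > 0:
        reasons.append("unsafe recommendations on the test set")
    if results.median_minutes_to_resolution >= baseline.median_minutes_to_resolution:
        reasons.append("time-to-resolution does not beat the baseline")
    if results.cost_per_resolved_order >= baseline.cost_per_resolved_order:
        reasons.append("cost per resolved order does not beat the baseline")
    return (not reasons, reasons)
```

Returning the list of reasons alongside the boolean keeps a no-go decision explainable to stakeholders.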

4 — Responsible AI. Allergen caution (bias toward refusing when unsure), no alcohol to minors, aggregate-first privacy (no individual PII in the PoC), and transparency (every recommendation cites the menu source it came from).
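The "refuse when unsure" bias and the citation requirement can be sketched as a deterministic filter that runs on candidate items before anything reaches the passenger. The `MenuItem` shape, the `age_verified_adult` flag, and the function names below are hypothetical; the notes do not specify how allergen data or age checks are represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MenuItem:
    name: str
    source: str                      # menu document the item came from, used for the citation
    allergens: Optional[frozenset]   # None means the allergen information is unknown
    contains_alcohol: bool = False


def safe_to_recommend(item: MenuItem, stated_allergies: set, age_verified_adult: bool) -> bool:
    if item.allergens is None:
        return False  # unknown allergen data: refuse rather than guess
    if item.allergens & stated_allergies:
        return False  # item contains a stated allergen
    if item.contains_alcohol and not age_verified_adult:
        return False  # no alcohol unless the passenger is a verified adult
    return True


def recommend(items, stated_allergies, age_verified_adult=False):
    """Return only safe items, each with the menu source it came from."""
    return [
        f"{item.name} (source: {item.source})"
        for item in items
        if safe_to_recommend(item, set(stated_allergies), age_verified_adult)
    ]
```

An empty result is a valid outcome here: it is the signal for the assistant to refuse or hand off rather than recommend.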

5 — Time-box. 2–8 weeks; weekly deliverables (wk1 corpus + KB; wk2 retrieval baseline; wk3 grounded answers + citations; wk4 eval harness + safety gate; later weeks hardening). Evaluation checkpoint each week.

6 — Stakeholders. Peachtree (airline wedge), Skyport (concessionaire data), the airport authority (gatekeeper), and representative passengers for CSAT.

Model selection

Requirements: text in/out now (vision later for an allergy-card photo); low latency (gate-side, anxious passenger); tight cost target (thin per-order margin). Run a comparative evaluation on a HungerSync test set of real passenger questions with ground-truth “correct, available, safe” answers — not public benchmarks. Favor a fast/cheap model for the live path, reserving a stronger model for hard cases.
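One way to run the comparison is to score every candidate on the same test set and pick the cheapest model that clears the safety, latency, and accuracy bars. The sketch below assumes each test-set answer has already been graded against ground truth; the record fields, the thresholds passed to `pick_live_model`, and the model labels are all illustrative assumptions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRecord:
    model: str
    correct: bool
    available: bool
    safe: bool
    latency_ms: float
    cost_usd: float


def summarize(records):
    """Aggregate graded answers into per-model metrics."""
    by_model = {}
    for r in records:
        by_model.setdefault(r.model, []).append(r)
    return {
        model: {
            # An answer passes only if it is correct, available, and safe.
            "pass_rate": mean(1.0 if (r.correct and r.available and r.safe) else 0.0 for r in rows),
            "unsafe": sum(1 for r in rows if not r.safe),
            "mean_latency_ms": mean(r.latency_ms for r in rows),
            "mean_cost_usd": mean(r.cost_usd for r in rows),
        }
        for model, rows in by_model.items()
    }


def pick_live_model(summary, max_latency_ms, min_pass_rate):
    """Cheapest model with zero unsafe answers that meets latency and accuracy bars."""
    eligible = [
        m for m, s in summary.items()
        if s["unsafe"] == 0
        and s["mean_latency_ms"] <= max_latency_ms
        and s["pass_rate"] >= min_pass_rate
    ]
    return min(eligible, key=lambda m: summary[m]["mean_cost_usd"]) if eligible else None
```

A model that fails the live-path bars on latency but passes on safety and accuracy is still a candidate for the "hard cases" fallback role.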

COSTAR prompt (worked, for the assistant)

Context: passenger is at a specific gate during a delay; only current menus apply.
Objective: recommend 1–3 items that are available now and safe for stated needs.
Steps: check dietary/allergen constraints first; then availability; then proximity/prep-time.
Tone: brief, calm, concrete.
Audience: a hungry, time-pressured traveler.
Response: item, vendor, gate, prep-time, and the menu source cited; refuse if unsure.

(COSTAR = Context, Objective, Steps, Tone, Audience, Response — the actionable variant of the “Context/Observation/Specifics/Target/Action/Result” framing in the module notes.)
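The six COSTAR fields above can be assembled into a prompt string programmatically, so the gate, stated needs, and current menu text are filled in per request. This is a minimal sketch: the function name, its parameters, and the "Current menus" section are assumptions, and the field wording is taken from the worked prompt.

```python
def build_costar_prompt(gate: str, dietary_needs: list[str], menu_snippets: list[str]) -> str:
    """Assemble a COSTAR-style prompt; menu_snippets are retrieved current-menu excerpts."""
    needs = ", ".join(dietary_needs) if dietary_needs else "none stated"
    menus = "\n".join(f"- {snippet}" for snippet in menu_snippets)
    sections = [
        ("Context", f"Passenger is at gate {gate} during a delay; only the current menus "
                    f"below apply. Stated needs: {needs}."),
        ("Objective", "Recommend 1-3 items that are available now and safe for the stated needs."),
        ("Steps", "Check dietary/allergen constraints first; then availability; "
                  "then proximity/prep-time."),
        ("Tone", "Brief, calm, concrete."),
        ("Audience", "A hungry, time-pressured traveler."),
        ("Response", "Item, vendor, gate, prep-time, and the menu source cited; refuse if unsure."),
    ]
    body = "\n".join(f"{label}: {text}" for label, text in sections)
    return f"{body}\n\nCurrent menus:\n{menus}"


if __name__ == "__main__":
    print(build_costar_prompt("B12", ["peanut allergy"], ["Vendor A: veggie wrap, 5 min prep"]))
```

Keeping the fields in a fixed order makes prompts easy to diff when the wording is tuned during evaluation.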

Cost / value

TCO vs. status quo (runner labor + abandoned-sale loss); cost per resolved order (token + retrieval); break-even on pilot volume. Strategic value: the airline voucher rail and the data exhaust that feeds the prediction layer.
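The break-even arithmetic can be sketched as follows, assuming the pilot has a fixed up-front cost and that each resolved order saves the difference between the status-quo cost per order (runner labor plus abandoned-sale loss) and the assistant's cost per order. The function names and this simple linear model are assumptions, not the course's prescribed formula.

```python
import math
from typing import Optional


def cost_per_resolved_order(token_cost: float, retrieval_cost: float, resolved_orders: int) -> float:
    """Variable assistant cost (tokens + retrieval) spread over resolved orders."""
    return (token_cost + retrieval_cost) / resolved_orders


def break_even_orders(
    fixed_cost: float,
    status_quo_cost_per_order: float,
    assistant_cost_per_order: float,
) -> Optional[int]:
    """Orders needed for per-order savings to cover the fixed cost; None if there is no saving."""
    saving_per_order = status_quo_cost_per_order - assistant_cost_per_order
    if saving_per_order <= 0:
        return None  # the assistant never pays back on cost alone
    return math.ceil(fixed_cost / saving_per_order)
```

Comparing the result with expected pilot volume shows whether the cost case stands on its own or has to lean on the strategic value.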

Implementation on AWS (playthrough overlay — only place vendor names appear)

PoC on Amazon Bedrock; Bedrock Model Evaluation for the comparative model pick on the HungerSync test set.
IAM least-privilege roles; encryption in transit/at rest; CloudTrail for API audit logging, CloudWatch for usage and latency metrics, and Cost Explorer for cost visibility.
Bedrock Converse API token metrics for the cost-per-order math.
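A Converse response carries a `usage` object with `inputTokens` and `outputTokens`, which is enough to price each call. The sketch below works on already-returned response dictionaries so it runs without AWS credentials; the per-1,000-token prices are placeholder parameters to be filled in from the current price list for the chosen model, and the function names are hypothetical.

```python
def converse_call_cost(response: dict, usd_per_1k_input: float, usd_per_1k_output: float) -> float:
    """Price one Converse call from the token counts in its usage block."""
    usage = response["usage"]
    return (
        usage["inputTokens"] / 1000 * usd_per_1k_input
        + usage["outputTokens"] / 1000 * usd_per_1k_output
    )


def token_cost_per_order(responses: list, usd_per_1k_input: float, usd_per_1k_output: float) -> float:
    """Sum the cost of every model call made while resolving a single order."""
    return sum(converse_call_cost(r, usd_per_1k_input, usd_per_1k_output) for r in responses)
```

Retrieval cost is not included here, so this is only the token half of the cost-per-order figure.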

Maps to

Task statements: AIP-1.1 (analyze requirements & design); touches AIP-1.2 (model selection), AIP-1.6 (prompt/COSTAR), AIP-4.1 (cost) — those get their full HungerSync build in ML2 / CS5 / CS9.
Feeds case: CS1 — Designing HungerSync. Reuses exhibits: BMC, value-chain.