HungerSync build — Grounded discovery over a vendor knowledge base
All code below is original HungerSync work using the public Bedrock APIs. The AWS lab is copyrighted and is not reproduced here.
The concept (provider-neutral)
Retrieval-Augmented Generation (RAG): convert a question to a vector, retrieve the most relevant chunks from a knowledge base (semantic or hybrid search), augment the prompt with those chunks plus source attribution, and generate a grounded answer. There are two postures: a managed retrieve-then-generate call (fast to stand up), and a manual retrieve-then-augment-then-generate loop (full control to filter before generation). Then evaluate with faithfulness, answer relevancy, and context precision/recall, and guardrail unsafe content.
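The four steps can be sketched without any provider. This is a toy, not HungerSync's implementation: a bag-of-words cosine similarity stands in for a real embedding model, the three-document corpus and its source names are invented for illustration, and generation is left as a prompt string that would be handed to whatever model is in use.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; sources and text are invented for illustration.
CORPUS = [
    {"source": "vendor-a-menu", "text": "Vegan grain bowl with tofu, ready in 6 minutes."},
    {"source": "vendor-b-menu", "text": "Beef burger with cheese, ready in 12 minutes."},
    {"source": "walk-times", "text": "Gate B12 to the food court is a 4 minute walk."},
]


def embed(text):
    """Toy 'embedding': lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, corpus, k=2):
    """Step 1 + 2: vectorize the question and return the k most similar chunks."""
    q = embed(question)
    ranked = sorted(corpus, key=lambda d: cosine(q, embed(d["text"])), reverse=True)
    return ranked[:k]


def augment(question, chunks):
    """Step 3: build a prompt that carries the chunks plus their source attribution."""
    context = "\n".join(f"[{c['source']}] {c['text']}" for c in chunks)
    return (
        "Answer using ONLY the context and cite sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Step 4 (generate) would send `prompt` to a model; it is omitted here on purpose.
top = retrieve("vegan bowl ready fast", CORPUS, k=1)
prompt = augment("vegan bowl ready fast", top)
```

The manual posture corresponds to inserting a filtering step between `retrieve` and `augment`; the managed posture hides all three calls behind one API.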
The HungerSync problem (world-neutral)
The discovery assistant must answer “what can I eat near gate B12 that’s vegan and ready in under 10 minutes?” using only current, ground-truth facts (vendor menus, availability, location, dietary tags), never inventing an item or a vendor, and never recommending something unsafe for a stated allergy. The failure modes are stale recommendations (the item is 86’d, i.e. no longer available) and hallucinated ones; grounding, evaluation, and a safety gate are the fix. This is exactly why the manual posture matters here: we must filter by availability and allergen safety before the model speaks.
The knowledge base
Documents: one per vendor with menu items; a gate/terminal walk-time sheet; a
dietary/allergen reference. Each chunk carries metadata: vendor, gate, concourse,
dietary_tags, allergens, available (bool), prep_minutes. Metadata is what lets us
filter retrieved chunks to available + safe before generating.
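One way to attach these fields is sketched below. It assumes an S3-backed Bedrock knowledge base, where metadata is supplied in a sidecar file named after the source document with a `.metadata.json` suffix and a top-level `metadataAttributes` object; check the current Bedrock documentation before relying on that layout. The helper name, vendor, and values are hypothetical. Because a sidecar describes a whole source file, per-item fields such as `available` imply one file per menu item or a custom chunking step, which is also an assumption here.

```python
import json


def build_metadata_sidecar(vendor, gate, concourse, dietary_tags, allergens,
                           available, prep_minutes):
    """Return the sidecar body for one menu item, using the field names listed above."""
    return {
        "metadataAttributes": {
            "vendor": vendor,
            "gate": gate,
            "concourse": concourse,
            "dietary_tags": list(dietary_tags),
            "allergens": list(allergens),
            "available": bool(available),
            "prep_minutes": int(prep_minutes),
        }
    }


# Invented example item; none of these values come from a real menu.
sidecar = build_metadata_sidecar(
    vendor="Green Gate Bowls",
    gate="B12",
    concourse="B",
    dietary_tags=["vegan"],
    allergens=["soy"],
    available=True,
    prep_minutes=6,
)
sidecar_json = json.dumps(sidecar, indent=2)
```

Keeping `available` a real boolean and `prep_minutes` a real number matters later: the filters compare against typed values, not strings.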
Pattern A — managed retrieve-and-generate (gate concierge)
    import boto3
    from botocore.config import Config

    # Resolve the region from the active session and pass it to the client explicitly.
    region = boto3.session.Session().region_name
    agent = boto3.client(
        "bedrock-agent-runtime",
        region_name=region,
        config=Config(read_timeout=120, retries={"max_attempts": 2}),
    )

    # $search_results$ and $output_format_instructions$ are placeholders that the
    # managed call fills in; everything else is HungerSync's own instruction text.
    GATE_CONCIERGE_PROMPT = """You are HungerSync's gate concierge. Use ONLY the search
    results to recommend food a delayed passenger can get quickly. Rules:
    - Recommend an item ONLY if the search results show it is available right now.
    - If the passenger states a dietary need or allergy, recommend only items the
      results confirm are safe; if you are not certain an item is safe, do not recommend it.
    - Cite the vendor and gate for every item. If nothing qualifies, say so plainly.

    Search results:
    $search_results$

    $output_format_instructions$"""

    def concierge_answer(question, kb_id, model_arn, k=5):
        """Single managed call: retrieve the top-k chunks and generate from them."""
        resp = agent.retrieve_and_generate(
            input={"text": question},
            retrieveAndGenerateConfiguration={
                "type": "KNOWLEDGE_BASE",
                "knowledgeBaseConfiguration": {
                    "knowledgeBaseId": kb_id,
                    "modelArn": model_arn,
                    "retrievalConfiguration": {
                        "vectorSearchConfiguration": {"numberOfResults": k}
                    },
                    "generationConfiguration": {
                        "promptTemplate": {"textPromptTemplate": GATE_CONCIERGE_PROMPT}
                    },
                },
            },
        )
        return resp["output"]["text"]

Pattern B — manual retrieve, filter, then generate (the safe path)
    def retrieve_chunks(question, kb_id, k=8, search="HYBRID"):
        """Retrieve only; no generation happens in this call."""
        r = agent.retrieve(
            retrievalQuery={"text": question},
            knowledgeBaseId=kb_id,
            retrievalConfiguration={
                "vectorSearchConfiguration": {"numberOfResults": k, "overrideSearchType": search}
            },
        )
        return r["retrievalResults"]

    def keep_safe_and_available(chunks, allergens_to_avoid):
        """Filter retrieved chunks to available items with no disqualifying allergen.
        Anything whose metadata can't confirm safety is dropped (fail-closed)."""
        avoid = {str(a).lower() for a in allergens_to_avoid}
        safe = []
        for c in chunks:
            md = c.get("metadata") or {}
            # Require an explicit boolean True; a missing or non-boolean flag counts as unavailable.
            if md.get("available") is not True:
                continue
            # A missing or malformed allergen list means safety is unconfirmed, so drop the chunk.
            allergens = md.get("allergens")
            if not isinstance(allergens, (list, tuple, set)):
                continue
            if {str(a).lower() for a in allergens} & avoid:
                continue
            safe.append(c)
        return safe

    def grounded_answer(question, kb_id, bedrock_runtime, model_id, allergens=()):
        chunks = retrieve_chunks(question, kb_id)
        safe = keep_safe_and_available(chunks, allergens)
        if not safe:
            return "I can't confirm anything available and safe for that right now."
        # Each context entry leads with its attribution so the model can cite vendor and gate.
        context = "\n\n".join(
            f"[{c['metadata'].get('vendor')} @ gate {c['metadata'].get('gate')}, "
            f"~{c['metadata'].get('prep_minutes')} min] {c['content']['text']}"
            for c in safe
        )
        prompt = (
            "You are HungerSync's gate concierge. Recommend 1-3 items using ONLY the context. "
            "Cite vendor and gate. If unsure an item is safe or available, omit it.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}"
        )
        messages = [{"role": "user", "content": [{"text": prompt}]}]
        resp = bedrock_runtime.converse(modelId=model_id, messages=messages)
        return resp["output"]["message"]["content"][0]["text"]

The keep_safe_and_available filter is the HungerSync-specific move the financial lab
didn’t need: it makes the retrieval layer fail-closed on safety before the model
ever sees the chunk.
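A complementary option is to narrow results on the server before they come back, so the top-k slots are not spent on unavailable items. The sketch below builds a metadata filter for the `vectorSearchConfiguration` of the Retrieve call, using the `equals`, `lessThanOrEquals`, and `andAll` filter operators. The helper names and parameters are hypothetical, and it assumes `available`, `prep_minutes`, and `concourse` were ingested as typed metadata. Allergen checks are deliberately left to keep_safe_and_available, which stays in place as the final fail-closed check.

```python
def build_prefilter(max_prep_minutes=None, concourse=None):
    """Build a retrieval metadata filter that keeps only available chunks,
    optionally bounded by prep time and concourse."""
    conditions = [{"equals": {"key": "available", "value": True}}]
    if max_prep_minutes is not None:
        conditions.append(
            {"lessThanOrEquals": {"key": "prep_minutes", "value": max_prep_minutes}}
        )
    if concourse is not None:
        conditions.append({"equals": {"key": "concourse", "value": concourse}})
    # andAll expects at least two conditions, so a lone condition is returned bare.
    return conditions[0] if len(conditions) == 1 else {"andAll": conditions}


def retrieval_config(k=8, search="HYBRID", **filter_kwargs):
    """Assemble the retrievalConfiguration argument with the prefilter attached."""
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": k,
            "overrideSearchType": search,
            "filter": build_prefilter(**filter_kwargs),
        }
    }


# Example: vegan-in-10-minutes question asked from concourse B.
config = retrieval_config(max_prep_minutes=10, concourse="B")
```

The resulting dict would be passed as `retrievalConfiguration=config` in place of the inline dict in retrieve_chunks. The server-side filter is an optimization only; a chunk that slips through with missing metadata is still dropped client-side.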
Evaluation (RAGAS-style + two HungerSync gates)
    # Standard RAG metrics on a HungerSync test set (question, answer, contexts, ground_truth):
    #   faithfulness      -> did the answer stay grounded in retrieved menus? (no invented items)
    #   answer_relevancy  -> did it address the passenger's actual ask?
    #   context_precision -> were retrieved chunks on-topic?
    #   context_recall    -> did we retrieve the items that should have qualified?
    #
    # Plus two HungerSync-specific pass/fail gates that must hit 100% before go-live:
    def availability_accuracy(recommendations, live_menu):
        """True only if every recommended item is marked available in the live menu."""
        return all(live_menu.get(item, {}).get("available") for item in recommendations)

    def allergen_safety(recommendations, live_menu, avoid):
        """True only if every recommended item is on the live menu, lists its allergens,
        and shares none with `avoid`. Unknown items fail rather than raise."""
        avoid = set(avoid)
        return all(
            item in live_menu
            and "allergens" in live_menu[item]
            and not (set(live_menu[item]["allergens"]) & avoid)
            for item in recommendations
        )

Faithfulness and the two gates are the metrics that matter most here: a fluent answer that recommends an 86’d or unsafe item is a failing answer regardless of relevancy.
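To turn the two gates into a go-live check, they can be run over a whole test set and compared against 100%. The sketch below is self-contained and uses a hypothetical case shape (`recommendations` and `avoid` per case) plus an invented two-item menu; it restates the gate logic inline in a fail-closed form, so an item missing from the live menu fails both gates.

```python
def gate_pass_rates(cases, live_menu):
    """Return (availability_rate, allergen_rate) across all test cases.

    Each case is {"recommendations": [...], "avoid": [...]}.
    """
    if not cases:
        raise ValueError("need at least one test case")

    def available_ok(case):
        return all(
            live_menu.get(item, {}).get("available") is True
            for item in case["recommendations"]
        )

    def allergens_ok(case):
        avoid = set(case["avoid"])
        return all(
            item in live_menu
            and "allergens" in live_menu[item]
            and not (set(live_menu[item]["allergens"]) & avoid)
            for item in case["recommendations"]
        )

    n = len(cases)
    return (
        sum(available_ok(c) for c in cases) / n,
        sum(allergens_ok(c) for c in cases) / n,
    )


# Invented data for illustration only.
LIVE_MENU = {
    "grain bowl": {"available": True, "allergens": ["soy"]},
    "pad thai": {"available": False, "allergens": ["peanut"]},
}
CASES = [
    {"recommendations": ["grain bowl"], "avoid": ["peanut"]},
    {"recommendations": ["pad thai"], "avoid": ["peanut"]},
]

rates = gate_pass_rates(CASES, LIVE_MENU)
ready_for_go_live = rates == (1.0, 1.0)
```

Here the second case recommends an unavailable item containing the avoided allergen, so both rates come out at 0.5 and the go-live check fails.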
Guardrail note
A content/safety guardrail (allergen + alcohol-to-minors + injection defense) wraps this at the application edge; its full build lives in CS8 — Trust & Safety. Here we cover only the retrieval-side fail-closed filter.
Implementation on AWS (playthrough overlay — only place vendor names appear)
- Amazon Bedrock Knowledge Bases over OpenSearch Serverless as the vector store; chunking + embeddings managed by the KB.
- Retrieve / RetrieveAndGenerate APIs (bedrock-agent-runtime); Converse API for the manual-path generation + token metrics.
- Amazon Nova Lite (fast/cheap live path) or Claude for hard cases.
- RAGAS via LangChain for the metric harness; Bedrock Guardrails for the CS8 edge.
- Re-skinnable: swap the KB/vector store/FM for a GCP/Azure/Databricks equivalent and the neutral problem above is unchanged.
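For the token metrics mentioned in the list above, the Converse response carries a `usage` block (`inputTokens`, `outputTokens`, `totalTokens`) and a `metrics` block (`latencyMs`) alongside the generated message. A minimal sketch of pulling them out follows; the helper name is hypothetical and the sample response is a hand-written stand-in, not real model output.

```python
def converse_metrics(resp):
    """Extract token counts, latency, and stop reason from a Converse response dict."""
    usage = resp.get("usage", {})
    return {
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "total_tokens": usage.get("totalTokens"),
        "latency_ms": resp.get("metrics", {}).get("latencyMs"),
        "stop_reason": resp.get("stopReason"),
    }


# Hand-written stand-in with the same shape as a Converse response.
sample_response = {
    "output": {"message": {"role": "assistant", "content": [{"text": "..."}]}},
    "stopReason": "end_turn",
    "usage": {"inputTokens": 412, "outputTokens": 58, "totalTokens": 470},
    "metrics": {"latencyMs": 930},
}

metrics = converse_metrics(sample_response)
```

Logging these per request makes it possible to compare the fast path against the hard-case model on cost and latency.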
Maps to
- Task statements: AIP-1.4 (vector store), AIP-1.5 (retrieval mechanisms), AIP-5.1 (RAG evaluation); touches AIP-3.1 (guardrails → full build in CS8).
- Feeds case: CS4 — Discovery you can trust. Reuses exhibit: customer journey.