HungerSync build — Grounded discovery over a vendor knowledge base

All code below is original HungerSync work using the public Bedrock APIs. The AWS lab is copyrighted and is not reproduced here.

Retrieval-Augmented Generation (RAG) works in four steps: convert the question to a vector, retrieve the most relevant chunks from a knowledge base (semantic or hybrid search), augment the prompt with those chunks plus source attribution, and generate a grounded answer. There are two postures: a managed retrieve-then-generate call, which is fast to stand up, and a manual retrieve, augment, generate loop, which gives full control to filter before generation. The result is then evaluated for faithfulness, answer relevancy, and context precision and recall, and unsafe content is guardrailed.

The discovery assistant must answer “what can I eat near gate B12 that’s vegan and ready in under 10 minutes?” using only current, ground-truth facts — vendor menus, availability, location, dietary tags — never inventing an item or a vendor, and never recommending something unsafe for a stated allergy. The failure modes are stale recommendations (item is 86’d) and hallucinated ones; grounding + evaluation + a safety gate are the fix. This is exactly why the manual posture matters here: we must filter by availability and allergen safety before the model speaks.

Documents: one per vendor with menu items; a gate/terminal walk-time sheet; a dietary/allergen reference. Each chunk carries metadata: vendor, gate, concourse, dietary_tags, allergens, available (bool), prep_minutes. Metadata is what lets us filter retrieved chunks to available + safe before generating.
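
To make the metadata concrete, here is a minimal sketch of one menu item's metadata serialized as a sidecar document. It assumes an S3-backed knowledge base that reads a `<document>.metadata.json` file with a top-level `metadataAttributes` object. The helper name `build_sidecar`, the vendor, and all field values are hypothetical.

```python
import json

# Fields every menu chunk is expected to carry, per the list above.
REQUIRED_FIELDS = ("vendor", "gate", "concourse", "dietary_tags",
                   "allergens", "available", "prep_minutes")


def build_sidecar(item: dict) -> str:
    """Serialize one menu item's metadata as a sidecar JSON document.

    Raises ValueError if a required field is missing, so an incomplete
    record is caught at ingestion time instead of at query time.
    """
    missing = [f for f in REQUIRED_FIELDS if f not in item]
    if missing:
        raise ValueError(f"missing metadata fields: {missing}")
    attributes = {f: item[f] for f in REQUIRED_FIELDS}
    return json.dumps({"metadataAttributes": attributes}, indent=2)


# Hypothetical menu item, for illustration only.
example_item = {
    "vendor": "Green Bowl",
    "gate": "B12",
    "concourse": "B",
    "dietary_tags": ["vegan"],
    "allergens": ["soy"],
    "available": True,
    "prep_minutes": 7,
}
sidecar_json = build_sidecar(example_item)
print(sidecar_json)
```

Rejecting incomplete records at ingestion keeps the later availability and allergen filters from having to guess at missing fields.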

Pattern A — managed retrieve-and-generate (gate concierge)

import boto3
from botocore.config import Config

# Resolve the region from the active session and pass it to the client explicitly.
region = boto3.session.Session().region_name
agent = boto3.client(
    "bedrock-agent-runtime",
    region_name=region,
    config=Config(read_timeout=120, retries={"max_attempts": 2}),
)

# $search_results$ and $output_format_instructions$ are placeholders that the
# knowledge base fills in at generation time.
GATE_CONCIERGE_PROMPT = """You are HungerSync's gate concierge. Use ONLY the search
results to recommend food a delayed passenger can get quickly. Rules:
- Recommend an item ONLY if the search results show it is available right now.
- If the passenger states a dietary need or allergy, recommend only items the results
confirm are safe; if you are not certain an item is safe, do not recommend it.
- Cite the vendor and gate for every item. If nothing qualifies, say so plainly.
Search results:
$search_results$
$output_format_instructions$"""


def concierge_answer(question, kb_id, model_arn, k=5):
    """Managed path: one call retrieves the top-k chunks and generates the answer."""
    resp = agent.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {"numberOfResults": k}
                },
                "generationConfiguration": {
                    "promptTemplate": {"textPromptTemplate": GATE_CONCIERGE_PROMPT}
                },
            },
        },
    )
    return resp["output"]["text"]
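
The managed call also returns the passages behind the answer, which is where source attribution comes from. Below is a minimal sketch of pulling vendor and gate attribution out of a response. It assumes the `citations` / `retrievedReferences` shape of the `RetrieveAndGenerate` response and that each reference carries the `vendor` and `gate` metadata keys described earlier. `cited_sources` is a hypothetical helper and the sample response is hand-built, not real output.

```python
def cited_sources(resp: dict) -> list:
    """Collect unique (vendor, gate) pairs from a retrieve_and_generate response."""
    seen = []
    for citation in resp.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            md = ref.get("metadata", {})
            pair = (md.get("vendor"), md.get("gate"))
            # Skip references that lack attribution rather than guess at it,
            # and skip pairs that were already recorded.
            if None in pair or pair in seen:
                continue
            seen.append(pair)
    return seen


# Hand-built response in the assumed shape, for illustration only.
sample_resp = {
    "output": {"text": "Try the tofu bowl at Green Bowl (gate B12)."},
    "citations": [
        {"retrievedReferences": [
            {"content": {"text": "Tofu bowl, vegan"},
             "metadata": {"vendor": "Green Bowl", "gate": "B12"}},
            {"content": {"text": "Walk times"}, "metadata": {}},
        ]},
        {"retrievedReferences": [
            {"content": {"text": "Green salad"},
             "metadata": {"vendor": "Green Bowl", "gate": "B12"}},
        ]},
    ],
}
print(cited_sources(sample_resp))
```

An answer whose citation list comes back empty is a useful signal that the model may not have grounded its recommendation.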

Pattern B — manual retrieve, filter, then generate (the safe path)

# Reuses the `agent` client created in Pattern A.
def retrieve_chunks(question, kb_id, k=8, search="HYBRID"):
    """Return the top-k chunks; `search` is "HYBRID" or "SEMANTIC"."""
    r = agent.retrieve(
        retrievalQuery={"text": question},
        knowledgeBaseId=kb_id,
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": k,
                "overrideSearchType": search,
            }
        },
    )
    return r["retrievalResults"]


def keep_safe_and_available(chunks, allergens_to_avoid):
    """Filter retrieved chunks to available items with no disqualifying allergen.
    Anything whose metadata can't confirm safety is dropped (fail-closed)."""
    avoid = set(allergens_to_avoid)
    safe = []
    for c in chunks:
        md = c.get("metadata", {})
        # Only an explicit True counts as available.
        if md.get("available") is not True:
            continue
        # A missing allergen list means safety is unconfirmed, so drop the chunk.
        if "allergens" not in md:
            continue
        item_allergens = md["allergens"]
        if isinstance(item_allergens, str):
            # A single tag stored as a string; wrap it so set() does not
            # split it into characters.
            item_allergens = [item_allergens]
        if set(item_allergens) & avoid:
            continue
        safe.append(c)
    return safe


def grounded_answer(question, kb_id, bedrock_runtime, model_id, allergens=()):
    """Manual path: retrieve, filter to safe and available chunks, then generate."""
    chunks = retrieve_chunks(question, kb_id)
    safe = keep_safe_and_available(chunks, allergens)
    if not safe:
        # Nothing passed the filter, so answer without calling the model.
        return "I can't confirm anything available and safe for that right now."
    context = "\n\n".join(
        f"[{c['metadata'].get('vendor')} @ gate {c['metadata'].get('gate')}, "
        f"~{c['metadata'].get('prep_minutes')} min] {c['content']['text']}"
        for c in safe
    )
    prompt = (
        "You are HungerSync's gate concierge. Recommend 1-3 items using ONLY the context. "
        "Cite vendor and gate. If unsure an item is safe or available, omit it.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    messages = [{"role": "user", "content": [{"text": prompt}]}]
    resp = bedrock_runtime.converse(modelId=model_id, messages=messages)
    return resp["output"]["message"]["content"][0]["text"]

The keep_safe_and_available filter is the HungerSync-specific move the financial lab didn’t need: it makes the retrieval layer fail-closed on safety before the model ever sees the chunk.
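
The same metadata can also narrow the search itself, as defense in depth; the Python filter stays the authority on safety. The sketch below assumes the `filter` field of `vectorSearchConfiguration` and the `equals`, `lessThanOrEquals`, and `andAll` operators of the Retrieve API. `build_retrieval_config` and its `max_prep_minutes` parameter are hypothetical, and allergen checks are deliberately left to the post-retrieval filter.

```python
def build_retrieval_config(k=8, search="HYBRID", max_prep_minutes=None):
    """Build a retrievalConfiguration that only returns available items.

    Optionally caps prep time. Allergen screening is NOT pushed down here;
    it remains the job of the post-retrieval filter.
    """
    conditions = [{"equals": {"key": "available", "value": True}}]
    if max_prep_minutes is not None:
        conditions.append(
            {"lessThanOrEquals": {"key": "prep_minutes", "value": max_prep_minutes}}
        )
    # andAll needs at least two conditions; a single condition is passed as-is.
    metadata_filter = conditions[0] if len(conditions) == 1 else {"andAll": conditions}
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": k,
            "overrideSearchType": search,
            "filter": metadata_filter,
        }
    }


config = build_retrieval_config(max_prep_minutes=10)
print(config)
```

The returned dictionary would be passed as `retrievalConfiguration=` in the `retrieve` call. Filtering at search time means the top-k slots are less likely to be spent on unavailable items.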

Evaluation (RAGAS-style + two HungerSync gates)

# Standard RAG metrics on a HungerSync test set (question, answer, contexts, ground_truth):
#   faithfulness      -> did the answer stay grounded in retrieved menus? (no invented items)
#   answer_relevancy  -> did it address the passenger's actual ask?
#   context_precision -> were retrieved chunks on-topic?
#   context_recall    -> did we retrieve the items that should have qualified?
#
# Plus two HungerSync-specific pass/fail gates that must hit 100% before go-live:
def availability_accuracy(recommendations, live_menu):
    """True only if every recommended item is on the live menu and available."""
    return all(live_menu.get(item, {}).get("available") for item in recommendations)


def allergen_safety(recommendations, live_menu, avoid):
    """True only if every recommended item has a known allergen list clear of `avoid`.
    An item missing from the live menu, or with no allergen list, fails the gate."""
    avoid = set(avoid)
    for item in recommendations:
        allergens = live_menu.get(item, {}).get("allergens")
        if allergens is None or set(allergens) & avoid:
            return False
    return True

Faithfulness and the two gates are the metrics that matter most here: a fluent answer that recommends an 86’d or unsafe item is a failing answer regardless of relevancy.
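
A minimal sketch of turning the two gates into a go-live check over a test set. The gate functions are restated compactly so the block runs on its own. The case structure (`recommendations`, `avoid`), the names `gate_report` and `ready_for_go_live`, and the sample menu are assumptions for illustration, not part of any library.

```python
def availability_accuracy(recommendations, live_menu):
    # Every recommended item must be on the live menu and marked available.
    return all(live_menu.get(item, {}).get("available") for item in recommendations)


def allergen_safety(recommendations, live_menu, avoid):
    # Unknown items or missing allergen lists fail the gate (fail-closed).
    avoid = set(avoid)
    for item in recommendations:
        allergens = live_menu.get(item, {}).get("allergens")
        if allergens is None or set(allergens) & avoid:
            return False
    return True


def gate_report(cases, live_menu):
    """Return the pass rate of each gate across all test cases."""
    if not cases:
        raise ValueError("need at least one test case")
    passed = {"availability_accuracy": 0, "allergen_safety": 0}
    for case in cases:
        recs = case["recommendations"]
        passed["availability_accuracy"] += bool(availability_accuracy(recs, live_menu))
        passed["allergen_safety"] += bool(allergen_safety(recs, live_menu, case["avoid"]))
    return {name: count / len(cases) for name, count in passed.items()}


def ready_for_go_live(report):
    """Both gates must pass on every case."""
    return all(rate == 1.0 for rate in report.values())


# Hypothetical live menu and test cases, for illustration only.
live_menu = {
    "tofu bowl": {"available": True, "allergens": ["soy"]},
    "pad thai": {"available": False, "allergens": ["peanut"]},
}
cases = [
    {"recommendations": ["tofu bowl"], "avoid": ["peanut"]},
    {"recommendations": ["pad thai"], "avoid": ["peanut"]},
]
report = gate_report(cases, live_menu)
print(report, ready_for_go_live(report))
```

Unlike the averaged RAG metrics, these rates are compared against exactly 1.0: one unsafe or unavailable recommendation anywhere in the test set blocks release.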

A content/safety guardrail (allergen + alcohol-to-minors + injection defense) wraps this at the application edge; its full build lives in CS8 — Trust & Safety. Here we cover only the retrieval-side fail-closed filter.

Implementation on AWS (playthrough overlay — only place vendor names appear)

  • Amazon Bedrock Knowledge Bases over OpenSearch Serverless as the vector store; chunking + embeddings managed by the KB.
  • Retrieve / RetrieveAndGenerate APIs (bedrock-agent-runtime); Converse API for the manual-path generation + token metrics.
  • Amazon Nova Lite (fast/cheap live path) or Claude for hard cases.
  • RAGAS via LangChain for the metric harness; Bedrock Guardrails for the CS8 edge.
  • Re-skinnable: swap the KB/vector store/FM for a GCP/Azure/Databricks equivalent and the neutral problem above is unchanged.
  • Task statements: AIP-1.4 (vector store), AIP-1.5 (retrieval mechanisms), AIP-5.1 (RAG evaluation); touches AIP-3.1 (guardrails → full build in CS8).
  • Feeds case: CS4 — Discovery you can trust. Reuses exhibit: customer journey.
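
For the token metrics mentioned in the list above, here is a minimal sketch of reading usage and latency from a Converse response. It assumes the `usage` (`inputTokens`, `outputTokens`, `totalTokens`) and `metrics` (`latencyMs`) fields of the Converse API response. `converse_metrics` is a hypothetical helper and the sample response is hand-built.

```python
def converse_metrics(resp: dict) -> dict:
    """Extract token counts and latency from a Converse response.

    Missing fields come back as None instead of raising, so logging
    never breaks the answer path.
    """
    usage = resp.get("usage", {})
    return {
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "total_tokens": usage.get("totalTokens"),
        "latency_ms": resp.get("metrics", {}).get("latencyMs"),
    }


# Hand-built response in the assumed shape, for illustration only.
sample_converse_resp = {
    "output": {"message": {"role": "assistant",
                           "content": [{"text": "Tofu bowl at Green Bowl, gate B12."}]}},
    "usage": {"inputTokens": 412, "outputTokens": 38, "totalTokens": 450},
    "metrics": {"latencyMs": 820},
}
print(converse_metrics(sample_converse_resp))
```

Logging these per request makes the trade-off between the fast, cheap model and the stronger model for hard cases measurable.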