HungerSync build — Grounded discovery over a vendor knowledge base

AIP-C01 / AIP-1.4, 1.5, 5.1 — RAG with Bedrock Knowledge Bases, including a fail-closed safety filter for allergens and availability. Feeds CS4.

All code below is original HungerSync work using the public Bedrock APIs. The AWS lab is copyrighted and is not reproduced here.

The concept (provider-neutral)

Retrieval-Augmented Generation: convert a question to a vector, retrieve the most relevant chunks from a knowledge base (semantic or hybrid search), augment the prompt with those chunks plus source attribution, and generate a grounded answer. There are two postures: a managed retrieve-then-generate call (fast to stand up), and a manual retrieve-then-augment-then-generate loop (full control to filter before generation). In either posture, evaluate the output on faithfulness, answer relevancy, and context precision and recall, and put a guardrail in front of unsafe content.
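The retrieve and augment steps can be illustrated without any provider at all. The following is a minimal sketch, assuming a toy bag-of-words "embedding" in place of a real embedding model and three hypothetical sample documents; the generate step is omitted because it needs a model call.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, docs, k=2):
    """Rank documents by similarity to the question and keep the top k."""
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d["text"])), reverse=True)
    return ranked[:k]

def augment(question, chunks):
    """Build a prompt that carries each chunk together with its source attribution."""
    context = "\n".join(f"[source: {c['source']}] {c['text']}" for c in chunks)
    return f"Answer using ONLY the context.\n\nContext:\n{context}\n\nQuestion: {question}"

# Hypothetical sample documents, for illustration only.
DOCS = [
    {"source": "green-bowl.md", "text": "vegan grain bowl ready in 6 minutes near gate B12"},
    {"source": "burger-bay.md", "text": "beef burger with cheese ready in 12 minutes near gate C3"},
    {"source": "walk-times.md", "text": "gate B12 to gate C3 is a 9 minute walk"},
]

top = retrieve("vegan food near gate B12", DOCS, k=2)
prompt = augment("vegan food near gate B12", top)
```

A real system swaps `embed` for an embedding model and `DOCS` for a vector store; the shape of the loop stays the same.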

The HungerSync problem (world-neutral)

The discovery assistant must answer “what can I eat near gate B12 that’s vegan and ready in under 10 minutes?” using only current, ground-truth facts — vendor menus, availability, location, dietary tags — never inventing an item or a vendor, and never recommending something unsafe for a stated allergy. The failure modes are stale recommendations (item is 86’d) and hallucinated ones; grounding + evaluation + a safety gate are the fix. This is exactly why the manual posture matters here: we must filter by availability and allergen safety before the model speaks.

The knowledge base

Documents: one per vendor with menu items; a gate/terminal walk-time sheet; a dietary/allergen reference. Each chunk carries metadata: vendor, gate, concourse, dietary_tags, allergens, available (bool), prep_minutes. Metadata is what lets us filter retrieved chunks to available + safe before generating.
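As a sketch of how that metadata might be attached, assume the documents live in an S3 data source and use the Bedrock Knowledge Bases sidecar convention: a `<document>.metadata.json` file whose `metadataAttributes` object holds the fields. The helper name, the sample vendor, and the choice of one source file per menu item (so that item-level attributes apply cleanly) are assumptions for illustration.

```python
import json

def build_metadata_sidecar(vendor, gate, concourse, dietary_tags,
                           allergens, available, prep_minutes):
    """Return JSON text for a '<document>.metadata.json' sidecar (hypothetical helper)."""
    attributes = {
        "vendor": vendor,
        "gate": gate,
        "concourse": concourse,
        "dietary_tags": list(dietary_tags),  # string list
        "allergens": list(allergens),        # string list; [] means "none", absent means "unknown"
        "available": bool(available),
        "prep_minutes": int(prep_minutes),
    }
    return json.dumps({"metadataAttributes": attributes}, indent=2)

# Hypothetical menu item, for illustration only.
sidecar = build_metadata_sidecar(
    vendor="Green Bowl", gate="B12", concourse="B",
    dietary_tags=["vegan"], allergens=["sesame"],
    available=True, prep_minutes=6,
)
```

Writing an explicit empty `allergens` list for allergen-free items matters: it lets a downstream filter tell "no allergens" apart from "allergen data missing".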

Pattern A — managed retrieve-and-generate (gate concierge)

import boto3
from botocore.config import Config

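# Resolve the region from the default session and pass it to the client explicitly.
region = boto3.session.Session().region_name
agent = boto3.client("bedrock-agent-runtime", region_name=region,
                     config=Config(read_timeout=120, retries={"max_attempts": 2}))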

GATE_CONCIERGE_PROMPT = """You are HungerSync's gate concierge. Use ONLY the search
results to recommend food a delayed passenger can get quickly. Rules:
- Recommend an item ONLY if the search results show it is available right now.
- If the passenger states a dietary need or allergy, recommend only items the results
  confirm are safe; if you are not certain an item is safe, do not recommend it.
- Cite the vendor and gate for every item. If nothing qualifies, say so plainly.
Search results:
$search_results$
$output_format_instructions$"""

def concierge_answer(question, kb_id, model_arn, k=5):
    resp = agent.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration={
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {"numberOfResults": k}
                },
                "generationConfiguration": {
                    "promptTemplate": {"textPromptTemplate": GATE_CONCIERGE_PROMPT}
                },
            },
        },
    )
    return resp["output"]["text"]
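Pattern A returns only the generated text, but the managed response also carries a `citations` list whose `retrievedReferences` point back at the source chunks. The following is a minimal sketch for pulling those out so the application can check that every recommendation is backed by a retrieved chunk. The helper name is hypothetical, the `vendor` and `gate` keys assume the metadata schema described earlier, and the sample response is hand-built for illustration rather than captured from a real call.

```python
def extract_citations(resp):
    """Collect (vendor, gate, source_uri) for every reference the managed call cited."""
    cited = []
    for citation in resp.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            md = ref.get("metadata", {})
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            cited.append((md.get("vendor"), md.get("gate"), uri))
    return cited

# Hand-built response fragment with hypothetical values.
sample_resp = {
    "output": {"text": "Try the grain bowl at Green Bowl, gate B12."},
    "citations": [
        {
            "retrievedReferences": [
                {
                    "content": {"text": "Vegan grain bowl, ready in about 6 minutes."},
                    "location": {"type": "S3",
                                 "s3Location": {"uri": "s3://example-bucket/green-bowl.md"}},
                    "metadata": {"vendor": "Green Bowl", "gate": "B12"},
                }
            ]
        }
    ],
}

cited = extract_citations(sample_resp)
```

To use this with `concierge_answer`, the function would need to return the full response (or the citations alongside the text) instead of only `resp["output"]["text"]`.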

Pattern B — manual retrieve, filter, then generate (the safe path)

import json

def retrieve_chunks(question, kb_id, k=8, search="HYBRID"):
    r = agent.retrieve(
        retrievalQuery={"text": question},
        knowledgeBaseId=kb_id,
        retrievalConfiguration={"vectorSearchConfiguration":
                                {"numberOfResults": k, "overrideSearchType": search}},
    )
    return r["retrievalResults"]

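def keep_safe_and_available(chunks, allergens_to_avoid):
    """Filter retrieved chunks to available items with no disqualifying allergen.
    Anything whose metadata can't confirm safety is dropped (fail-closed)."""
    avoid = {str(a).lower() for a in allergens_to_avoid}
    safe = []
    for c in chunks:
        md = c.get("metadata", {})
        # Availability must be explicitly True; missing or non-boolean values are dropped.
        if md.get("available") is not True:
            continue
        # A missing allergen list means "unknown", not "none", so drop the chunk.
        item_allergens = md.get("allergens")
        if not isinstance(item_allergens, (list, tuple, set)):
            continue
        if {str(a).lower() for a in item_allergens} & avoid:
            continue
        safe.append(c)
    return safe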

def grounded_answer(question, kb_id, bedrock_runtime, model_id, allergens=()):
    chunks = retrieve_chunks(question, kb_id)
    safe = keep_safe_and_available(chunks, allergens)
    if not safe:
        return "I can't confirm anything available and safe for that right now."
    context = "\n\n".join(
        f"[{c['metadata'].get('vendor')} @ gate {c['metadata'].get('gate')}, "
        f"~{c['metadata'].get('prep_minutes')} min] {c['content']['text']}"
        for c in safe
    )
    prompt = (
        "You are HungerSync's gate concierge. Recommend 1-3 items using ONLY the context. "
        "Cite vendor and gate. If unsure an item is safe or available, omit it.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    messages = [{"role": "user", "content": [{"text": prompt}]}]
    resp = bedrock_runtime.converse(modelId=model_id, messages=messages)
    return resp["output"]["message"]["content"][0]["text"]

The keep_safe_and_available filter is the HungerSync-specific move the financial lab didn’t need: it makes the retrieval layer fail-closed on safety before the model ever sees the chunk.
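One complement to the client-side filter is to push the simple constraints into the retrieval call itself, so the `k` result slots are not spent on chunks that would be discarded afterwards. The following is a sketch, assuming the metadata keys described earlier and the `filter` field of `vectorSearchConfiguration` in the Retrieve API. The helper name and its parameters are hypothetical, and allergen exclusion deliberately stays in the client-side filter in this sketch.

```python
def build_retrieval_config(k=8, max_prep_minutes=None, dietary_tag=None, search="HYBRID"):
    """Build a retrievalConfiguration dict with a server-side metadata filter (hypothetical helper)."""
    conditions = [{"equals": {"key": "available", "value": True}}]
    if max_prep_minutes is not None:
        conditions.append(
            {"lessThanOrEquals": {"key": "prep_minutes", "value": max_prep_minutes}}
        )
    if dietary_tag:
        conditions.append({"listContains": {"key": "dietary_tags", "value": dietary_tag}})
    # andAll expects at least two conditions; a lone condition is passed on its own.
    metadata_filter = conditions[0] if len(conditions) == 1 else {"andAll": conditions}
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": k,
            "overrideSearchType": search,
            "filter": metadata_filter,
        }
    }

# Example: vegan items ready in 10 minutes or less.
config = build_retrieval_config(k=8, max_prep_minutes=10, dietary_tag="vegan")
```

The resulting dict would be passed as `retrievalConfiguration` to `agent.retrieve(...)`. Server-side filtering narrows the candidates; it does not replace the fail-closed check, which still runs on whatever comes back.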

Evaluation (RAGAS-style + two HungerSync gates)

# Standard RAG metrics on a HungerSync test set (question, answer, contexts, ground_truth):
#   faithfulness        -> did the answer stay grounded in retrieved menus? (no invented items)
#   answer_relevancy    -> did it address the passenger's actual ask?
#   context_precision   -> were retrieved chunks on-topic?
#   context_recall      -> did we retrieve the items that should have qualified?
#
# Plus two HungerSync-specific pass/fail gates that must hit 100% before go-live:
def availability_accuracy(recommendations, live_menu):
    return all(live_menu.get(item, {}).get("available") for item in recommendations)

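def allergen_safety(recommendations, live_menu, avoid):
    # Fail closed: an item missing from live_menu, or lacking an allergen list, counts as unsafe.
    def is_safe(item):
        allergens = live_menu.get(item, {}).get("allergens")
        return allergens is not None and not (set(allergens) & set(avoid))
    return all(is_safe(item) for item in recommendations)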

Faithfulness and the two gates are the metrics that matter most here: a fluent answer that recommends an 86’d or unsafe item is a failing answer regardless of relevancy.
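To make the "100% before go-live" rule concrete, here is a minimal harness sketch that runs both gates over a test set and reports which cases fail and why. The case format (`recommendations` plus `avoid`), the function names, and the sample menu are assumptions for illustration; the gate logic is restated locally so the block runs on its own, and it treats unknown items as failures.

```python
def availability_ok(recs, live_menu):
    """Every recommended item must be explicitly available in the live menu."""
    return all(live_menu.get(item, {}).get("available") is True for item in recs)

def allergen_ok(recs, live_menu, avoid):
    """Every recommended item must have a known allergen list that avoids `avoid`."""
    def safe(item):
        allergens = live_menu.get(item, {}).get("allergens")
        return allergens is not None and not (set(allergens) & set(avoid))
    return all(safe(item) for item in recs)

def gate_report(cases, live_menu):
    """Run both gates per case; go_live is True only if no case fails."""
    failures = []
    for i, case in enumerate(cases):
        recs, avoid = case["recommendations"], case.get("avoid", [])
        reasons = []
        if not availability_ok(recs, live_menu):
            reasons.append("availability")
        if not allergen_ok(recs, live_menu, avoid):
            reasons.append("allergen")
        if reasons:
            failures.append((i, reasons))
    return {
        "total": len(cases),
        "passed": len(cases) - len(failures),
        "failures": failures,
        "go_live": not failures,
    }

# Hypothetical live menu and test cases.
LIVE_MENU = {
    "grain bowl": {"available": True, "allergens": ["sesame"]},
    "pad thai": {"available": True, "allergens": ["peanut"]},
    "soup": {"available": False, "allergens": []},
}
CASES = [
    {"recommendations": ["grain bowl"], "avoid": ["peanut"]},
    {"recommendations": ["pad thai"], "avoid": ["peanut"]},
    {"recommendations": ["soup"], "avoid": []},
    {"recommendations": ["mystery wrap"], "avoid": []},
]

report = gate_report(CASES, LIVE_MENU)
```

Reporting the failing case indices with reasons, rather than a single boolean, makes it easier to trace a failed gate back to stale metadata or a retrieval miss.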

Guardrail note

A content/safety guardrail (allergen + alcohol-to-minors + injection defense) wraps this at the application edge; its full build lives in CS8 — Trust & Safety. Here we cover only the retrieval-side fail-closed filter.

Implementation on AWS (playthrough overlay — only place vendor names appear)

Amazon Bedrock Knowledge Bases over OpenSearch Serverless as the vector store; chunking + embeddings managed by the KB.
Retrieve / RetrieveAndGenerate APIs (bedrock-agent-runtime); Converse API for the manual-path generation + token metrics.
Amazon Nova Lite (fast/cheap live path) or Claude for hard cases.
RAGAS via LangChain for the metric harness; Bedrock Guardrails for the CS8 edge.
Re-skinnable: swap the KB/vector store/FM for a GCP/Azure/Databricks equivalent and the neutral problem above is unchanged.
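The token metrics mentioned for the Converse API come back on the same response as the generated text, under `usage` and `metrics`. The following is a small sketch for pulling them out for logging; the helper name is hypothetical and the sample response is hand-built with made-up numbers, not captured from a real call.

```python
def usage_summary(resp):
    """Extract token counts and latency from a Converse API response dict."""
    usage = resp.get("usage", {})
    metrics = resp.get("metrics", {})
    return {
        "input_tokens": usage.get("inputTokens"),
        "output_tokens": usage.get("outputTokens"),
        "total_tokens": usage.get("totalTokens"),
        "latency_ms": metrics.get("latencyMs"),
    }

# Hand-built response fragment with made-up numbers.
sample_converse_resp = {
    "output": {"message": {"role": "assistant",
                           "content": [{"text": "Try the grain bowl at gate B12."}]}},
    "usage": {"inputTokens": 412, "outputTokens": 38, "totalTokens": 450},
    "metrics": {"latencyMs": 830},
}

summary = usage_summary(sample_converse_resp)
```

Logging these per request gives a basis for comparing the fast live-path model against the one reserved for hard cases.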

Maps to

Task statements: AIP-1.4 (vector store), AIP-1.5 (retrieval mechanisms), AIP-5.1 (RAG evaluation); touches AIP-3.1 (guardrails → full build in CS8).
Feeds case: CS4 — Discovery you can trust. Reuses exhibit: customer journey.