AI Operating Model

Working note

This page describes the durable shape of the work.

The working assumption is that classical MLOps and newer context-heavy GenAI work share the same lifecycle, even when the tooling looks different.

What stays constant

Which stages remain stable across classical ML, retrieval-heavy workflows, prompt systems, and agentic systems?

What actually changes

Where do prompts, context, tools, evaluations, and provider constraints change the operating model in ways that matter?

Intent and framing

Problem definition, users, risks, and what “good” means.

Data and context

Source material, retrieval surfaces, freshness, and trust boundaries.
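Freshness and trust boundaries can be enforced at retrieval time, before context reaches the model. A minimal sketch, where the document fields, the trusted-source list, and the 90-day cutoff are all illustrative assumptions, not part of this note:

```python
# Sketch: filter retrieved context by freshness and source trust before it
# reaches the model. Field names and the trusted-source set are assumptions.
from datetime import date, timedelta

TRUSTED_SOURCES = {"internal-wiki", "product-docs"}
MAX_AGE = timedelta(days=90)

def admit(doc: dict, today: date) -> bool:
    """Admit a retrieved document only if it is fresh and from a trusted source."""
    fresh = today - doc["updated"] <= MAX_AGE
    trusted = doc["source"] in TRUSTED_SOURCES
    return fresh and trusted
```

The point of the sketch is that freshness and trust are explicit, testable predicates at a boundary, not properties assumed of the corpus as a whole.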

Build and evaluation

Experiment loops, evals, regression checks, and review gates.
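One way evals become review gates rather than diagnostics is to score candidates against a fixed regression set and block release below a threshold. A hedged sketch, where the cases, the scoring rule, and the threshold are hypothetical:

```python
# Sketch: treat an eval suite as a release gate, not just a diagnostic.
# The regression cases, scoring, and threshold are illustrative assumptions.

def run_eval(candidate, cases):
    """Fraction of fixed regression cases the candidate answers exactly."""
    scores = [candidate(inp) == expected for inp, expected in cases]
    return sum(scores) / len(scores)

REGRESSION_CASES = [
    ("2+2", "4"),
    ("capital of France", "Paris"),
]

GATE_THRESHOLD = 0.9  # releases scoring below this are blocked

def release_gate(candidate):
    """Return (passed, score) so the score is recorded even when the gate fails."""
    score = run_eval(candidate, REGRESSION_CASES)
    return score >= GATE_THRESHOLD, score
```

Returning the score alongside the pass/fail decision keeps the diagnostic use intact while making the gate itself binary.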

Release and operations

Deployment, observability, rollback, and ongoing governance.
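Rollback becomes concrete when each release pins its provider-side dependencies, so restoring a release restores a known configuration rather than just old code. A minimal sketch, with every field name and value an assumption:

```python
# Sketch: a release record that pins provider model, prompt, and the eval
# score it passed at gate time. Field names and values are assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class Release:
    version: str
    model: str            # pinned provider model identifier
    prompt_hash: str      # content hash of the prompt template
    eval_baseline: float  # score the release passed at gate time

history: list[Release] = []

def deploy(release: Release) -> None:
    history.append(release)

def rollback() -> Release:
    """Drop the current release and restore the previous configuration."""
    history.pop()
    return history[-1]
```

Because provider model identifiers are part of the record, a provider deprecating a model version visibly breaks the rollback path instead of failing silently.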

System questions

  • Which artifacts are versioned, and which are still treated as loose operational state?
  • Where do evaluation results become release gates rather than just diagnostics?
  • How do provider dependencies change the deployment and rollback model?
  • Which decisions live with the platform team versus the product team?
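The first question above can be made operational by content-addressing artifacts such as prompt templates, so that an edit produces a new traceable version instead of silently mutating loose operational state. A sketch, with the registry layout assumed:

```python
# Sketch: content-address a prompt template so edits yield a new, traceable
# artifact version. The in-memory registry is an illustrative assumption.
import hashlib

REGISTRY: dict[str, str] = {}  # content hash -> template text

def register_prompt(template: str) -> str:
    """Store a prompt template under its content hash and return the hash."""
    digest = hashlib.sha256(template.encode("utf-8")).hexdigest()[:12]
    REGISTRY[digest] = template
    return digest
```

Any artifact that cannot be registered and referenced this way is, by this test, still loose operational state.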