Part C — Databricks Prototype (1-day, completed 2026-05-06)

Status: completed (2026-05-06). Executed PRD v2.1 §C against a Databricks SQL Warehouse provided by Asurion's data engineering team. Shipped Databricks connectivity + SQL Generator + per-metric routing for 2-3 metrics, plus the §C.6.1 cost_avoided_mtd reroute path that flips a tile from Postgres to Databricks via a single YAML edit. Other Part B items (Kafka, Bronze/Silver/Gold, Mosaic AI, governance, iframe sandbox, vector DB) remain mocked or deferred per ADR-PROTO-001..005. Mirrors ADR-008 (docs/lessons-learned.md § Mocks must be opt-in, never silent fallback) into the SQL gen path: Bedrock unavailable returns 503 with application/problem+json, never a silent template fallback. The remaining demo-arc work — Prompt 6 docs + the §C.6.1 reroute path + acceptance receipts #1/#3/#4/#6/#11 + the operator-gated #12 — closed out under docs/plans/completed/part-c-demo-ready.md.
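The fail-loud contract above can be sketched in a few lines. This is an illustrative sketch only, not the shipped handler: the function name and field values are assumptions, but the shape follows the contract stated here (Bedrock unavailable returns 503 with an application/problem+json body, never a silent template fallback).

```python
# Sketch of the ADR-008 mirror for the SQL gen path: a Bedrock outage surfaces
# as an RFC 7807 problem response with status 503, never a template fallback.
# Function and field names here are hypothetical, not the shipped code.

def bedrock_unavailable_problem(detail: str) -> tuple[int, dict, dict]:
    """Build a 503 application/problem+json response for SQL-gen failures."""
    body = {
        "type": "about:blank",
        "title": "SQL generation unavailable",
        "status": 503,
        "detail": detail,
    }
    headers = {"Content-Type": "application/problem+json"}
    return 503, headers, body
```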

Why this plan exists

A conversation with Asurion's data engineering team unlocked Databricks SQL Warehouse access plus a hand-curated set of S3-stored schema files. This is the highest-leverage credibility win available before the next sponsor checkpoint:

  • Real data on screen beats every architectural diagram. A senior leader seeing "Cost Avoided MTD by Region" populated from actual Asurion operational data shifts the project's posture from "interesting hackathon" to "this works against our environment."
  • The SQL generator is the riskiest unproven piece of Part B. Building it against real Databricks today either validates the pattern (ship the rest of Part B with confidence) or surfaces the issue now, not in week 3.
  • Kafka adds zero demonstrable value in a 90-second demo. It costs hours that should go to Databricks integration. Defer to architecture-diagram-only annotation per ADR-PROTO-001.

The full rationale is in PRD v2.1 §C.1; this plan is the execution-tracking surface.

Prerequisites

The companion PRD v2.1 hardening plan (PRD §C.4 fail-loud rewrite, §C.4.5 model-discipline subsection, §C.5.3 boot validation, instruction-file updates in CLAUDE.md / AGENTS.md / .cursor/rules/hackathon-base.mdc) is required pre-work. Without those edits, an executing agent will silently violate ADR-008 by re-introducing template fallback as a silent default. Specifically:

  • prd-v2.1.md §C.4.2 step 4d — fail-loud contract for SQL gen on Bedrock failure
  • prd-v2.1.md §C.4.5 — Haiku 4.5 + flat schema + sqlglot dialect rules
  • prd-v2.1.md §C.5.3 — boot-time routing-config validation
  • CLAUDE.md — Reading-order entry for prd-v2.1.md (precedence rule), Tech Stack lakehouse bullet, new "SQL generation discipline" sub-section, IMPORTANT bullets
  • AGENTS.md — prd-v2.1.md row in canonical-docs table

Scope (mirrors PRD v2.1 §C.2)

IN scope (per §C.2.1)

| # | Item | Where in Part B |
| --- | --- | --- |
| C-IN-01 | Databricks SQL Warehouse connectivity (Serverless Starter; auth method TBD per Q-PROTO-1) | New |
| C-IN-02 | Data dictionary loader pointing at the existing S3 schema files | Subset of B.5.4 |
| C-IN-03 | SQL Generator service: Bedrock + safety layer, anchored to metrics_catalog | New (referenced by B.7.1) |
| C-IN-04 | Per-metric routing config: config/metric_routing.yaml | New (extends B.5.5 / B.7.1) |
| C-IN-05 | POST /v1/widgets/{id}/data endpoint with the routing layer wired | B.8.1 |
| C-IN-06 | 2-3 metrics with real Databricks data lineage | Subset of B.5.4 |
| C-IN-07 | Frontend: widget renderers call /data, SourceBadge + freshness on each tile | B.7.1 |
| C-IN-08 | Updated architecture diagram with Kafka annotated "Phase 2" | Architecture doc |
| C-IN-09 | Demo runbook addition (~3-minute arc on top of existing 90-second arc) | New |
| C-IN-10 | docs/whats-mocked-in-prototype.md honest accounting | New |
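The per-metric routing config (C-IN-04) is the pivot of the §C.6.1 demo moment. A hypothetical sketch of config/metric_routing.yaml follows; the key names are illustrative assumptions, not the shipped schema, but the shape matches the plan: one entry per metrics_catalog name, with the reroute being a one-line flip of the source field.

```yaml
# Hypothetical sketch of config/metric_routing.yaml (key names are assumptions).
# The §C.6.1 demo flips cost_avoided_mtd's `source` from postgres to databricks
# and back; every metrics_catalog name must have an entry (boot-time validation).
metrics:
  cost_avoided_mtd:
    source: databricks      # flipped from `postgres` for the reroute demo
    cache_seconds: 300      # cold-start mitigation per risk #4
  claim_volume:
    source: databricks
    cache_seconds: 300
  subscriber_churn:         # illustrative synthetic metric; stays on Postgres
    source: postgres
```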

MOCKED (per §C.2.2)

| # | Item | Why mocked |
| --- | --- | --- |
| C-MOCK-01 | Kafka transport | Zero demo value vs setup cost; deferred to Part B |
| C-MOCK-02 | Bronze/Silver/Gold pipeline | Out of 1-day scope; existing tables are "Bronze enough" |
| C-MOCK-03 | Vector DB for data dictionary RAG | Dictionary fits in 200K context; ADR-PROTO-004 |
| C-MOCK-04 | Mosaic AI Model Serving | Trained models out of 1-day scope |
| C-MOCK-05 | Iframe sandbox for custom widgets | Security hardening deferred |
| C-MOCK-06 | Widget governance workflow | Approval flows deferred |
| C-MOCK-07 | Other 7+ metrics (synthetic) | Only 2-3 metrics need real lineage to prove the pattern |

OUT of scope (per §C.2.3)

  • Real source-system event integration (CRM/Claims/Telemetry feed)
  • Schema registry, Avro, dual-write pattern (Part B Week 1)
  • Multi-tenant security, encryption at rest, OAuth (Part B Week 4)
  • API Gateway + Lambda for SQL generation (collapsed into FastAPI per ADR-PROTO-003)
  • Vector DB ingest pipeline for the data dictionary (ADR-PROTO-004)

Hour-by-hour map (mirrors PRD §C.11 + prompts.md)

| Hour | Focus | Deliverable | Plan todo |
| --- | --- | --- | --- |
| 0 – 1 | Pre-flight with data engineer; confirm creds work from laptop | One curl-equivalent query returning rows | hour_0_preflight |
| 1 – 2.5 | Databricks client + settings + smoke test | GET /v1/databricks/health returns 200 | prompt_1_databricks_client |
| 2.5 – 4 | Dictionary loader from S3 + metrics_catalog enrichment | validate_dictionary.py exits 0 | prompt_2_dictionary_loader |
| 4 – 6 | SQL Generator service (Bedrock + safety layer + catalog anchor) | POST /v1/widgets/sql/generate dry-run returns valid SQL | prompt_3_sql_generator |
| 6 – 7.5 | Widget data resolver + per-metric routing + boot validation | At least one widget renders real Databricks rows | prompt_4_data_resolver |
| 7.5 – 8.5 | Frontend wiring (useWidgetData, SourceBadge, MetricInfoBadge) | Live dashboard tile from real data | prompt_5_frontend_wiring |
| 8.5 – 9.5 | Architecture diagram + demo runbook + whats-mocked doc + sql-generator.md | Demo dry-run completes cleanly twice | prompt_6_demo_runbook_docs |
| 9.5 – 10+ | Buffer + backup video recording | Backup MP4 saved | acceptance_dryrun_12 |

Cross-link: each Hour row maps to one prompt section in prompts.md. The prompts are the executable form; this plan is the tracking surface.

Acceptance criteria (mirrors PRD §C.10)

The prototype is "done" when all 12 of these pass in dry-run. Each one has a matching acceptance_dryrun_* todo above; flip those to completed only after the gate physically passes.

  • #1 — make up brings the full stack up in <60s (warm 5.62s / cold 6.02s — artifacts/part-c-demo-ready/20260506-131614/make_up_{cold,warm}.log)
  • #2 — curl http://localhost:8000/v1/databricks/health returns 200 with rows_sampled > 0 (returns 5000 — see docs/plans/completed/databricks-mock-data-and-prompt-1.md)
  • #3 — python backend/scripts/validate_dictionary.py exits 0 (2.93s, 0 errors, 50 not-seeded warnings — validate_dictionary.log)
  • #4 — POST /v1/widgets/sql/generate with dry_run=true returns valid SQL for each selected metric (4 receipts: sqlgen_claim_volume_l3_asurion.json, sqlgen_claims_by_product_l3_asurion.json, sqlgen_claim_status_mix_l3_asurion.json, sqlgen_cost_avoided_mtd.json)
  • #5 — Generated SQL passes the safety layer (25 unit tests in backend/tests/test_sql_safety.py covering SELECT-only, allowlist, LIMIT injection, forbidden DDL/DML, dialect='databricks')
  • #6 — Adversarial test: DROP TABLE-coercive data_intent is rejected with structured forbidden_construct error (live + unit receipts: gate6_live_adversarial.json + gate6_safety_violation_unit.log)
  • #7 — POST /v1/widgets/{id}/data returns real Databricks rows in <3s p95 (steady-state p50=49.2ms / p95=56.3ms across 5 cached calls — artifacts/prompt-5-frontend-wiring/20260506-120610/databricks_latency.txt; cost_avoided_mtd reroute path validated at 765ms execution — cost_avoided_mtd_databricks.json)
  • #8 — Same endpoint returns Postgres rows for at least one synthetic metric in <300ms p95 (live: 5-13ms p95 — artifacts/prompt-4-data-resolver/20260506/postgres_latency.txt)
  • #9 — Frontend dashboard renders all tiles; Databricks-backed tiles show "Source: Databricks · Last updated: Xs ago" (purple Databricks · 4s ago ▾ chip with click-to-expand SQL — artifacts/prompt-5-frontend-wiring/20260506-120610/dashboard_healthy.png)
  • #10 — Killing Databricks → graceful degradation with live_data_unavailable: true + amber SourceBadge + warn log (NOT silent MockLlm; PRD §C.10 #10 reworded). End-to-end verified on the UI: bogus token flips ONLY the Databricks-routed tile to amber Mock · live data unavailable; Postgres tiles + v1 KPI strip stay visually identical. Restoring the real token returns the tile to purple. Receipts: artifacts/prompt-5-frontend-wiring/20260506-120610/dashboard_databricks_down.png + dashboard_restored.png.
  • #11 — Demo dry-run completes the full ~3-minute arc cleanly twice in a row, including the §C.6.1 existing-tile-reroute moment (programmatic dry-run twice clean: $1.41M Postgres → YAML flip → $374,714.39 Databricks → YAML restore → $1.41M Postgres back. run1=14s, run2=14s. Receipts: dryrun_run1.log + dryrun_run2.log. Operator-driven UI rehearsal happens at demo time per the runbook's Part C arc.)
  • #12 — Backup video recorded (operator-gated, see artifacts/part-c-demo-ready/20260506-131614/E2_OPERATOR_HANDOFF.md) — every prerequisite shipped (backup-recording.md Part C steps, demo-runbook backup-recording slot pointer, .gitignore coverage). The MP4 itself requires a human driver against the running stack; the operator handoff file is the single landing page for that step.

Full risk table lives in prd-v2.1.md §C.9. The most likely-to-bite ones during execution:

  1. Databricks auth blocks Hour 0 → Get auth method confirmed before Hour 1; PAT is acceptable for prototype; have backup creds ready
  2. Data dictionary mismatches actual Databricks schema → validate_dictionary.py runs Hour 1 and exits non-zero on any mismatch
  3. Bedrock generates invalid SQL → Pre-validate the 2-3 demo queries Hour 6; if generation is unreliable, opt the metric into template_fallback (visible amber SourceBadge, NOT silent)
  4. Demo query latency exceeds 5s on Serverless Starter cold-start → Pre-warm warehouse Hour 7; cache demo queries with cache_seconds=300
  5. Token expiration kills demo mid-run → Long-lived PAT; document expiry in env config
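The cache_seconds mitigation from risk #4 amounts to an in-process TTL cache so a demo query pays warehouse latency at most once per window. A minimal sketch, with hypothetical names rather than the shipped resolver code:

```python
# Minimal TTL-cache sketch for the cache_seconds=300 mitigation (risk #4).
# Names are illustrative assumptions, not the shipped data-resolver code.
import time

_cache: dict[str, tuple[float, list]] = {}

def cached_rows(metric: str, fetch, cache_seconds: int = 300) -> list:
    """Return cached rows for `metric`, refetching after cache_seconds."""
    now = time.monotonic()
    hit = _cache.get(metric)
    if hit is not None and now - hit[0] < cache_seconds:
        return hit[1]      # warm path: no warehouse round-trip
    rows = fetch()         # cold path: hit Databricks/Postgres
    _cache[metric] = (now, rows)
    return rows
```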

Build discipline call-outs (non-negotiable)

Eight rules, each linked to its source. Skipping any one of these is the sound of a lesson being paid for a second time.

  1. SQL gen Bedrock failure → 503, never silent template fallback. Mirror of ADR-008 for the Clarifier. Source: PRD §C.4.2 step 4d, §C.4.5; docs/lessons-learned.md § Mocks must be opt-in, never silent fallback.
  2. Tool-input schema is a flat object only — { sql, tables_used, explanation }. No top-level oneOf. Source: lessons-learned § Bedrock tool-use rejects top-level oneOf schemas.
  3. sqlglot dialect explicit on every parse — sqlglot.parse_one(sql, dialect='databricks'). Source: prompts.md § Common failure modes — sqlglot rejects valid SQL because of dialect.
  4. Re-run make up after every backend or frontend source change + verify env vars with docker exec api env | grep DATABRICKS_. Source: lessons-learned § Stale containers hide UI work, § Watch the live env vars on make up.
  5. Real-credential smoke test for any Bedrock-routed feature — mock-only tests cannot catch tool-use access regressions. Source: lessons-learned § Bedrock tool-use rejects top-level oneOf schemas (recommended-mitigation paragraph).
  6. Local data dictionary is the single source of truth — never edit, never bypass. The four CSVs + ai_query_guidelines.md under data-dictionary/ (mounted into the api container at /app/data-dictionary) drive (a) Pydantic models in app.sql_gen.data_dictionary, (b) DDL generation for the l3_asurion seeder, (c) validate-dictionary structural and live passes, (d) the SQL generator's prompt-time table/join allowlist. Adding a metric or a table means adding a row in the CSVs, not hardcoding in Python. Source: docs/plans/active/prompt-2-dictionary-loader.md.
  7. metrics_catalog DDL has TWO authoritative locations. db/init.sql (fresh-DB path on make demo-reset) AND backend/app/metrics/catalog.py::_TABLE_DDL (idempotent migration via ensure_metrics_table in lifespan). Edits land in BOTH or fresh DBs and dev DBs diverge. No ALTER TABLE migration shim — edit the CREATE TABLE and require make demo-reset. Source: docs/plans/active/promote-metric-direction-to-catalog.md ddl_extend; docs/plans/active/prompt-2-dictionary-loader.md.
  8. Boot-time routing validator is fail-loud. Every metrics_catalog.name must have an entry in config/metric_routing.yaml; missing entries raise RuntimeError in app.main.lifespan and the api container exits non-zero. Reverse-direction (extra YAML rows without a catalog match) is a WARN, not fatal — staging routing for an upcoming seeder is fine. Source: PRD v2.1 §C.5.3, ADR-PROTO-005.

Time-box discipline (from prompts.md)

If you're 30 minutes over on any prompt, stop and assess rather than push through:

| Prompt | Budget | If overrun, cut |
| --- | --- | --- |
| 1 (Databricks client) | 90 min | Skip OAuth M2M and Service Principal; PAT-only |
| 2 (Dictionary loader) | 90 min | Skip S3 loader; hand-write config/data_dictionary.yaml for the 5 tables |
| 3 (SQL Generator) | 120 min | Skip free-text rejection in route layer (assume only catalog-anchored calls) |
| 4 (Data resolver) | 90 min | Skip cache; every request hits the backend |
| 5 (Frontend wiring) | 60 min | Skip MetricInfoBadge enhancements; just SourceBadge |
| 6 (Demo + docs) | 60 min | Skip whats-mocked doc — write it after the demo |

Non-negotiables: working /v1/databricks/health, working SQL generation for 2 demo metrics, dashboard tile populated from real data end-to-end. Everything else can be smaller or deferred.

Definitely not in scope

Don't even start any of these mid-prototype:

  • Kafka producer / consumer code (ADR-PROTO-001 — diagram only)
  • Bronze/Silver/Gold DLT pipelines (deferred to Part B)
  • Vector DB for the data dictionary (ADR-PROTO-004)
  • Mosaic AI Model Serving endpoints (deferred to Part B)
  • Iframe sandbox for custom widgets (deferred to Part B)
  • Widget governance workflow / approval UI (deferred to Part B)
  • API Gateway + Lambda for SQL gen (ADR-PROTO-003 — collapsed into FastAPI)
  • Re-routing more than 2-3 metrics to Databricks (the "everything else stays synthetic" line is the demo)

Done when

  • All 12 acceptance checkboxes above are green
  • make verify passes (extended with the new acceptance criteria where applicable)
  • Demo dry-run twice clean, including the §C.6.1 existing-tile-reroute moment
  • Backup video recorded
  • This plan moves to docs/plans/completed/part-c-databricks-prototype.md with status: completed and completed_on: <date>
  • Lessons surfaced via harvest_lessons are appended to docs/lessons-learned.md using the four-field format
  • docs/sql-generator.md exists (created in Prompt 6) and is referenced from AGENTS.md canonical-docs table

Close-out (2026-05-06)

Prompt 5 shipped under docs/plans/completed/prompt-5-frontend-wiring.md. The remaining demo close-out — Prompt 6 docs, the §C.6.1 live cost_avoided_mtd reroute path (seeder + dictionary + catalog source_query), receipts for gates #1/#3/#4/#6/#11/#12, and the plan move to completed/ — was tracked under part-c-demo-ready.md and shipped on the same date.

Final close-out (2026-05-06). All 12 PRD §C.10 acceptance gates green (gate #12 operator-gated per artifacts/part-c-demo-ready/20260506-131614/E2_OPERATOR_HANDOFF.md; every prerequisite landed). 117 backend tests + 74 frontend tests passing. The §C.6.1 live reroute path proven twice clean programmatically. Demo runbook + README + architecture diagram + lessons-learned all updated. Plan moved to docs/plans/completed/part-c-databricks-prototype.md in the same commit as the part-c-demo-ready close-out. CLAUDE.md "Active plans" line updated to drop Part C; "Current State" Part C bullet rewritten to reflect demo-ready status.

References

  • prd-v2.1.md §C — Part C spec (the source of truth)
  • prompts.md — hour-by-hour Claude Code prompts (the executable form)
  • prd.md §19 ADR-008 — mocks-as-opt-in / fail-loud (the discipline this plan extends to SQL gen)
  • docs/lessons-learned.md:
    • § Mocks must be opt-in, never silent fallback — the discipline that locks the §C.4.2 fail-loud rewrite
    • § Bedrock tool-use rejects top-level oneOf schemas — locks the flat tool-input schema for SQL gen
    • § Haiku 4.5 silently drops deeply-nested fields — locks the { sql, tables_used, explanation } shape
    • § Watch the live env vars on make up, not just the file — Databricks env passthrough into Docker
    • § Stale containers hide UI work — re-run make up after every change
  • ADR-PROTO-001..005 in prd-v2.1.md §C.8 (Kafka diagram-only; SQL gen anchored to metrics_catalog; SQL gen in FastAPI not Lambda; dictionary in prompt context; per-metric routing)