
Architecture

Top-level architecture of the Asurion Command Center hackathon stack. Read this first; subsystem-specific docs (widget-builder.md, sql-generator.md, data-dictionary.md) take you deeper. Source files cited inline so you can verify any claim against the code.

What this stack is

A real-time NPS-impact command center for support ops, built as a hackathon MVP. Five moving parts, all wired through a single FastAPI process:

  1. A Postgres datastore seeded with synthetic dashboard data (issue sessions, claims, devices, alerts, recommendations) plus a first-class metrics_catalog table (ADR-007) and a JSONB dashboard_layouts doc per user (ADR-009).
  2. A Redis pub/sub channel (pubsub:dashboard) the WebSocket endpoint at /v1/dashboard/stream mirrors out to connected browsers.
  3. The FastAPI app at backend/app/main.py — ingest, dashboard, decisions, widgets, metrics, demo admin, plus a Databricks health endpoint.
  4. The React frontend at frontend/ — dashboard tiles, the Add Widget Clarifier modal, the SourceBadge / MetricInfoBadge surfaces.
  5. Two outbound integrations:
    • AWS Bedrock for the Add Widget Clarifier (LangGraph-driven, ADR-005) and the Part C SQL generator (PRD v2.1 §C.4, in flight).
    • Databricks SQL Warehouse for the Part C live-data path. The personal workspace runs both workspace.asurion_prototype.* (v1 mirror, hits /v1/databricks/health) and workspace.l3_asurion.* (the dictionary-shaped tables Prompt 2 seeded — see data-dictionary.md).
```mermaid
flowchart LR
    browser["React UI<br/>(:3080)"]
    api["FastAPI<br/>(:8000)<br/>app.main"]
    pg[("Postgres<br/>cmdcenter")]
    rd[("Redis<br/>pubsub:dashboard")]
    bedrock["AWS Bedrock<br/>(Anthropic Claude)"]
    dbx_v1[("Databricks<br/>workspace.asurion_prototype")]
    dbx_l3[("Databricks<br/>workspace.l3_asurion<br/>(dictionary-shaped)")]
    dict["data-dictionary/<br/>(4 CSVs + guidelines)"]
    routing["config/metric_routing.yaml"]
    kafka["Kafka<br/>Phase 2 — production transport<br/>(ADR-PROTO-001, diagram only)"]

    browser -->|REST| api
    browser -->|WebSocket| api
    api -->|SQL| pg
    api -->|pub/sub| rd
    api -->|tool-use| bedrock
    api -->|/health probe| dbx_v1
    api -->|live SQL Part C| dbx_l3
    dict -.->|ro mount| api
    routing -.->|ro mount| api
    kafka -.->|deferred| api

    classDef phase2 stroke-dasharray:5 5,stroke:#888,color:#666;
    class kafka phase2;
```

The dashed kafka node is a placeholder per ADR-PROTO-001 — production replaces the direct POST /v1/events ingest path with Kafka, but the prototype proves the more uncertain piece (live data binding) and keeps event transport synchronous. See whats-mocked-in-prototype.md § Kafka (DIAGRAM ONLY).
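The synchronous ingest path can be sketched with an in-memory queue standing in for pubsub:dashboard. This is illustrative only — the real /v1/dashboard/stream endpoint subscribes via a Redis async client and forwards each message to the connected browser's WebSocket; the class and message shapes below are assumptions, not backend code:

```python
import asyncio
import json

# In-memory stand-in for the Redis pubsub:dashboard channel.
class Channel:
    def __init__(self) -> None:
        self.subscribers: list[asyncio.Queue] = []

    def subscribe(self) -> asyncio.Queue:
        """One queue per connected WebSocket client."""
        q: asyncio.Queue = asyncio.Queue()
        self.subscribers.append(q)
        return q

    def publish(self, message: dict) -> None:
        """Fan a JSON payload out to every subscriber."""
        payload = json.dumps(message)
        for q in self.subscribers:
            q.put_nowait(payload)

async def demo() -> list[str]:
    channel = Channel()
    inbox = channel.subscribe()  # a connected browser
    # Ingest path: POST /v1/events publishes after writing to Postgres.
    channel.publish({"type": "session_update", "nps_delta": -3})
    channel.publish({"type": "alert", "severity": "high"})
    return [await inbox.get() for _ in range(2)]

messages = asyncio.run(demo())
```

Production swaps the publish side for a Kafka producer without touching the fan-out side — which is exactly why ADR-PROTO-001 can defer it.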

Boot order — app.main.lifespan

Every gate runs inside one engine.connect() block so the routing validator observes the seeded rows. Source: backend/app/main.py lines 34-53.
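The sequence below amounts to a plain ordered gate runner: each gate is a callable sharing the one connection, and the first exception aborts startup. A minimal sketch (gate names mirror the diagram; the list-as-connection stand-in is purely illustrative):

```python
# Sketch of the lifespan gate sequence. In the real app each gate
# receives the single engine.connect() connection so the routing
# validator sees the freshly seeded rows.
def run_boot_gates(conn, gates):
    """Run gates in order; any exception propagates (fail-loud, ADR-008)."""
    completed = []
    for gate in gates:
        gate(conn)                 # raises -> lifespan never yields
        completed.append(gate.__name__)
    return completed

def ensure_widgets_table(conn): conn.append("widgets")
def seed_metrics_if_empty(conn): conn.append("metrics")
def validate_routing_against_catalog(conn):
    if "metrics" not in conn:
        raise RuntimeError("catalog seeded rows not visible to validator")

conn: list = []   # stand-in for the shared connection
order = run_boot_gates(conn, [ensure_widgets_table, seed_metrics_if_empty,
                              validate_routing_against_catalog])
```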

```mermaid
sequenceDiagram
    autonumber
    participant uvi as uvicorn
    participant lifespan as app.main.lifespan
    participant pg as Postgres
    participant routing as routing.validate_routing_against_catalog
    uvi->>lifespan: startup
    lifespan->>pg: _ensure_widgets_table
    lifespan->>pg: ensure_metrics_table (DDL + lineage ALTERs)
    lifespan->>pg: ensure_dashboard_layouts_table
    lifespan->>pg: seed_metrics_if_empty (13 rows)
    lifespan->>pg: seed_if_empty (synthetic dashboard data)
    lifespan->>routing: validate_routing_against_catalog(conn)
    routing-->>lifespan: ok or RuntimeError (ADR-008 fail-loud)
    lifespan-->>uvi: yield
```

Failure semantics: if any step raises, the lifespan never yields and Docker keeps the api container in a restart loop. The operator sees the full traceback in docker compose logs api. There is no silent recovery path — every gate is fail-loud per ADR-008. The classic example is the routing validator catching a metrics_catalog row without a corresponding entry in config/metric_routing.yaml; see sql-generator.md § Boot validator.
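A hedged sketch of that validator's contract — the catalog-key and routing-YAML shapes below are assumed from this doc, not copied from sql-generator.md:

```python
# Every metric_key in metrics_catalog must have an entry in
# config/metric_routing.yaml, otherwise boot fails loudly (ADR-008).
def validate_routing_against_catalog(catalog_keys, routing):
    missing = [k for k in catalog_keys if k not in routing]
    if missing:
        raise RuntimeError(
            f"metric_routing.yaml missing entries for: {', '.join(sorted(missing))}"
        )
    return True

# Illustrative metric keys and routing entries.
routing = {"nps_rolling_7d": {"source": "databricks"},
           "open_claims": {"source": "postgres"}}

ok = validate_routing_against_catalog(["nps_rolling_7d", "open_claims"], routing)
try:
    validate_routing_against_catalog(["nps_rolling_7d", "churn_risk"], routing)
    failed = False
except RuntimeError:
    failed = True   # boot would stop here; Docker restarts the container
```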

Service topology — docker-compose.yml

| Service | Image | Role | Notes |
| --- | --- | --- | --- |
| db | postgres:16-alpine | Source of truth for issue sessions, claims, devices, widgets, metric catalog, dashboard layouts | db/init.sql runs once on first boot; the lifespan migrations (ensure_*_table) handle additive schema changes for warm dev DBs. |
| redis | redis:7-alpine | pubsub:dashboard fan-out for the WebSocket | No persistence required. |
| api | Built from backend/Dockerfile | The FastAPI app + LangGraph Clarifier + Databricks client + dictionary loader | Mounts ./data-dictionary:/app/data-dictionary:ro and ./config:/app/config:ro for Part C. |
| web | Built from frontend/Dockerfile | The React UI | Talks to api via relative URLs in the production-mode Vite build. |

The api container also has read-write access to ${HOME}/.aws so boto3 can refresh SSO credentials on demand — :ro was tried and silently breaks Bedrock auth (lessons-learned § Stale containers hide UI work).
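Because a read-only mount breaks token refresh silently, a boot-time writability probe would surface the misconfiguration loudly. This helper is purely illustrative and not present in the repo:

```python
import os
import tempfile

def aws_dir_writable(path: str) -> bool:
    """Return True if boto3 could write refreshed SSO tokens under `path`."""
    probe = os.path.join(path, ".write-probe")
    try:
        with open(probe, "w") as fh:
            fh.write("ok")
        os.remove(probe)
        return True
    except OSError:
        return False

# Demo against a throwaway directory rather than the real ~/.aws mount.
demo_dir = tempfile.mkdtemp()
writable = aws_dir_writable(demo_dir)
```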

API surface

Routers live in backend/app/. All paths are prefixed with /v1 except /health. The full curated OpenAPI spec lives at api/openapi.yaml; the live one is always at http://localhost:8000/openapi.json.

| Router | Mount | Owner doc |
| --- | --- | --- |
| app.main | top-level routes (/health, /v1/databricks/health, /v1/dashboard/state, /v1/decisions/next-best-action, /v1/feedback/outcome, /v1/admin/reset-demo, WebSocket /v1/dashboard/stream) | this doc |
| app.widgets.routes | /v1/widgets/* (Add Widget Clarifier SSE + persist) | widget-builder.md |
| app.metrics.routes | /v1/metrics/* (catalog CRUD) | this doc + ADR-007 |
| app.dashboard.routes | /v1/dashboard/layout/* (drag-reorder layout per ADR-009) | this doc + ADR-009 |

Configuration surface

Env vars (declared in docker-compose.yml)

| Group | Key vars | Purpose |
| --- | --- | --- |
| Database | DATABASE_URL, REDIS_URL | Container-internal; never exposed externally. |
| Bedrock | BUILDER_MODE (live\|offline), USE_BEDROCK, AWS_PROFILE, AWS_REGION, BEDROCK_MODEL_ID | ADR-008. Default live — Bedrock failures surface as 503, never silent MockLlm. |
| Databricks | DATABRICKS_HOST, DATABRICKS_HTTP_PATH, DATABRICKS_TOKEN, DATABRICKS_CATALOG, DATABRICKS_SCHEMA, DATABRICKS_HEALTH_TABLE, DATABRICKS_QUERY_TIMEOUT_S, DATABRICKS_POOL_SIZE | Part C. Default catalog workspace, schema asurion_prototype. |
| Dictionary / routing | DATA_DICTIONARY_ROOT (/app/data-dictionary), METRIC_ROUTING_PATH (/app/config/metric_routing.yaml) | Part C. Mounted read-only into the container. |
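A minimal sketch of fail-loud env parsing in the spirit of ADR-008 — the variable name and values come from the table above, but the helper itself is hypothetical, not a function in the repo:

```python
VALID_BUILDER_MODES = {"live", "offline"}

def read_builder_mode(env: dict) -> str:
    """Default to 'live' so Bedrock failures surface as 503s, never a silent mock."""
    mode = env.get("BUILDER_MODE", "live")
    if mode not in VALID_BUILDER_MODES:
        raise RuntimeError(f"BUILDER_MODE must be live|offline, got {mode!r}")
    return mode

default_mode = read_builder_mode({})                       # unset -> "live"
explicit = read_builder_mode({"BUILDER_MODE": "offline"})
try:
    read_builder_mode({"BUILDER_MODE": "mock"})
    rejected = False
except RuntimeError:
    rejected = True   # a typo'd mode fails boot instead of degrading silently
```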

Mounted config

Two read-only bind mounts feed Part C at boot: ./data-dictionary (the 4 CSVs + guidelines consumed by the dictionary loader) and ./config/metric_routing.yaml (consumed by the routing validator). Their in-container paths are set by DATA_DICTIONARY_ROOT and METRIC_ROUTING_PATH above.

Postgres schema

Defined in db/init.sql. The metrics_catalog table has a dual source of truth: its CREATE TABLE is duplicated in backend/app/metrics/catalog.py as _TABLE_DDL so the lifespan migration path agrees with the fresh-DB path. Edits must land in BOTH copies, or fresh DBs and warm dev DBs diverge. The convention is documented in docs/lessons-learned.md § Dual-DDL source of truth.
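One way to keep the two copies from drifting is a normalization check that compares the strings. A sketch under stated assumptions — the DDL literals below are stand-ins, and no such guard currently exists in the repo:

```python
import re

def normalize_sql(ddl: str) -> str:
    """Collapse whitespace and case so cosmetic differences don't count."""
    return re.sub(r"\s+", " ", ddl).strip().lower()

# Stand-ins for db/init.sql and the _TABLE_DDL string in
# backend/app/metrics/catalog.py (illustrative columns only).
init_sql_ddl = """
CREATE TABLE IF NOT EXISTS metrics_catalog (
    metric_key TEXT PRIMARY KEY,
    display_name TEXT NOT NULL
)"""
catalog_py_ddl = ("CREATE TABLE IF NOT EXISTS metrics_catalog "
                  "( metric_key TEXT PRIMARY KEY, display_name TEXT NOT NULL )")

ddls_match = normalize_sql(init_sql_ddl) == normalize_sql(catalog_py_ddl)
```

Dropped into the backend pytest suite, this would turn a forgotten second edit into a red test instead of a diverged dev DB.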

ADRs that shape this architecture

The full set lives under docs/adrs/. The ones that most directly shape this top-level view:

  • ADR-001 — Postgres + Redis Compose, deferring the production data path.
  • ADR-008 — Mocks-as-opt-in / fail-loud. Drives the lifespan boot gates, the Bedrock 503 path, and the routing validator.
  • ADR-007 — metrics_catalog first-class + per-widget MetricDefinition block. Why widgets carry an embedded metric instead of a string label.
  • ADR-009 — Drag-reorder layout via a single JSONB dashboard_layouts doc.
  • ADR-PROTO-001..005 — Part C decisions: Kafka diagram-only, SQL gen anchored to catalog, SQL gen in FastAPI not Lambda, dictionary in prompt context (not vector DB), per-metric routing.

Forward-looking — Part B and beyond

prd-v2.1.md Part B describes the 4-week production sprint. The headline shifts:

  • Kafka transport replaces direct event ingest (ADR-V2-001).
  • Bronze / Silver / Gold lakehouse layers in Databricks replace the synthetic Postgres for the analytical surface.
  • A Mosaic AI gateway fronts Bedrock with rate limiting and guardrails.
  • Vector DB joins the dictionary loader for similarity-aware metric matching.
  • Iframe sandbox replaces @babel/standalone in-process compilation (ADR-006 evolution).

Part C (the slice in flight) deliberately mocks or defers all of those — see whats-mocked-in-prototype.md for the line-by-line accounting.

Operations

Local stack

```shell
make up                  # docker compose up --build -d
make demo-reset          # POST /v1/admin/reset-demo (requires running api)
make logs                # tail api logs
make seed-databricks     # workspace.asurion_prototype.* (v1 mirror)
make seed-databricks-l3  # workspace.l3_asurion.* (dictionary-shaped, Part C)
make validate-dictionary # structural + live dictionary validation
make test                # backend pytest inside the api container
make test-frontend       # Vitest, no docker
make verify              # scripts/verify-acceptance.sh end-to-end smoke
```

TechDocs preview

This site is rendered by Backstage TechDocs. To preview locally, see Phase 4 of the rollout plan: make docs-serve runs the spotify/techdocs container against the repo's mkdocs.yml.

Publish path

Out of scope for the hackathon — the plan ships the manifests so an operator can register the repo with a Backstage instance, but no CI publish action is wired. See the rollout plan's Risks + open questions section.