Backstage TechDocs + Project Documentation Rollout

The active plan lives in docs/plans/active/backstage-techdocs-rollout.md. Total budget is ~10.5-13h across 5 phases (the sum of the per-phase estimates below). Phase 1 must ship before any Phase 2+ work: the things this session built (data dictionary loader, routing validator, lineage columns, l3_asurion seeder) have no standalone documentation today, and that is the urgent gap.

Why this plan exists

Two pressures collided:

  1. Prompt 2 shipped a substantial subsystem with zero standalone documentation. The data-dictionary workflow, app.sql_gen package, config/metric_routing.yaml boot validator, dual-DDL metrics_catalog lineage discipline, and seed-databricks-l3 workflow all live only in code comments and the docs/plans/active/prompt-2-dictionary-loader.md execution log. New contributors need a real docs/ page per surface.
  2. Backstage IDP onboarding is blocked on missing pre-requisites. Backstage needs catalog-info.yaml, an OpenAPI spec at a stable path, and mkdocs.yml to consume the docs site as TechDocs. The repo has none of these today, and the 16 ADRs referenced everywhere by name are still embedded inside prd.md and prd-v2.1.md rather than extracted as individual files Backstage can render.

Target end-state

flowchart LR
    subgraph repo[2026-hackathon repo]
        catalog["catalog-info.yaml"]
        mkdocs["mkdocs.yml"]
        openapi["docs/api/openapi.yaml"]
        adrs["docs/adrs/*.md (16 files)"]
        guides["docs/architecture.md<br/>docs/sql-generator.md<br/>docs/data-dictionary.md<br/>docs/whats-mocked-in-prototype.md<br/>docs/demo-queries.md"]
        existing["docs/widget-builder.md<br/>docs/demo-runbook.md<br/>docs/lessons-learned.md"]
    end
    subgraph backstage[Backstage IDP]
        sw["Software Catalog<br/>(Component, System, Domain)"]
        td["TechDocs<br/>(mkdocs build)"]
        apient["API entity"]
    end
    catalog --> sw
    catalog --> apient
    openapi --> apient
    mkdocs --> td
    guides --> td
    adrs --> td
    existing --> td

Phase 1 — Capture Prompt 2 deliverables (3-4h, REQUIRED)

Goal: every surface this session shipped has a discoverable docs page that a new contributor can read end-to-end.

New pages

  • docs/architecture.md — top-level architecture overview with a single mermaid diagram covering FastAPI + Postgres + Redis + Databricks + Bedrock; cross-links to all subsystem pages. References backend/app/main.py lifespan as the canonical boot order.
  • docs/sql-generator.md — covers the app.sql_gen package shipped this session: data_dictionary models, dictionary_loader LRU caching, type_mapper rules, routing boot-validator fail-loud contract. Walks through the read path. Cross-links to prd-v2.1.md §C.4 and ADR-PROTO-002/004/005.
  • docs/data-dictionary.md — explains the four CSVs + ai_query_guidelines.md under data-dictionary/, the canonical-source-of-truth discipline, the make seed-databricks-l3 flow that renders DDL on the fly via backend/app/sql_gen/type_mapper.py, and the make validate-dictionary two-pass behaviour.
  • docs/whats-mocked-in-prototype.md — Q&A defense for stakeholder questions per prd-v2.1.md §C.7. Pairs each Part-B production element with its Part-C prototype mock or deferral (Kafka → diagram-only per ADR-PROTO-001; Bronze/Silver/Gold → asurion_prototype mirror; vector DB → in-prompt dictionary; etc.).
  • docs/demo-queries.md — the 3 Databricks-routed metrics (claim_volume_l3_asurion, claims_by_product_l3_asurion, claim_status_mix_l3_asurion) with sample input parameters, generated SQL, and expected output shape. Sourced from backend/app/metrics/seed.py.
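To make the "fail-loud contract" that docs/sql-generator.md has to describe concrete, here is a minimal sketch of a boot-time routing check. It is not the shipped validator: the function name validate_routing, the exception class, the backend names, and the rule that orphans are rejected in both directions are all assumptions for illustration. Only the metric names come from this plan.

```python
class RoutingValidationError(RuntimeError):
    """Raised at boot when routing config and metric catalog disagree."""


def validate_routing(
    routing: dict[str, str],
    catalog_metrics: set[str],
    known_backends: set[str],
) -> None:
    """Collect every mismatch, then raise once so startup fails loudly.

    Hypothetical contract: every routed metric must exist in the catalog,
    every catalog metric must be routed, and every backend must be known.
    """
    errors = []
    for metric, backend in sorted(routing.items()):
        if metric not in catalog_metrics:
            errors.append(f"routed metric {metric!r} is not in the catalog")
        if backend not in known_backends:
            errors.append(f"metric {metric!r} routes to unknown backend {backend!r}")
    for metric in sorted(catalog_metrics - set(routing)):
        errors.append(f"catalog metric {metric!r} has no routing entry")
    if errors:
        raise RoutingValidationError("; ".join(errors))


# Passes silently: the routed metric is present in the catalog.
validate_routing(
    {"claim_volume_l3_asurion": "databricks"},
    {"claim_volume_l3_asurion"},
    {"databricks", "postgres"},
)
```

Raising one aggregated error, rather than stopping at the first mismatch, means a contributor sees every orphan in a single failed boot.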

Edits to existing pages

  • docs/lessons-learned.md — append four entries from this session:
    • Dual-DDL source of truth (db/init.sql + _TABLE_DDL in backend/app/metrics/catalog.py)
    • MetricEntity Literal entrenchment (14+ usages — keep entity bare, schema lives on source_schema)
    • custom_widget_placeholder orphan from old custom-widget persists — fail-loud caught it
    • make demo-reset requires running api (chicken-and-egg when boot-validator blocks startup) — direct TRUNCATE is the workaround
  • CLAUDE.md reading-order section — add the new docs pages so agents pick them up before changing the SQL gen surface.
  • docs/plans/active/part-c-databricks-prototype.md — flip prompt_2_dictionary_loader final acceptance to "validated this session, see prompt-2-dictionary-loader plan execution log".

Phase 2 — Extract ADRs into a folder (2-3h)

Backstage TechDocs renders one ADR per page. Today the ADRs are embedded as headings inside two ~1000-line PRD files, which is hostile to discovery.

  • Create docs/adrs/ folder with one file per ADR using the standard 4-section template (Status / Context / Decision / Consequences). 16 files total:
    • From prd.md §19: ADR-001..010 (lines 830-980 approx — verify offsets at edit time).
    • From prd-v2.1.md §C.8: ADR-PROTO-001..005 (line 363+).
    • From prd-v2.1.md §V2: ADR-V2-001 (line 1109+).
  • Add docs/adrs/README.md index page with a status table.
  • Do NOT delete the embedded copies in the PRDs yet — instead add a "Canonical: see docs/adrs/ADR-XXX.md" pointer at each PRD heading. Two-step migration is safer than a rip-and-replace; if the rendered ADRs read well after Phase 4, a future cleanup PR removes the PRD copies.
  • Update every existing cross-reference (search the repo for ADR- — found in CLAUDE.md, docs/widget-builder.md, docs/plans/, code comments) to link the new files.
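The extraction can be checked mechanically. The sketch below enumerates the 16 expected ADR ids from the ranges listed above and reports which of the four template sections an ADR file is missing. The helper names are hypothetical, and it assumes each section is a markdown heading whose text is exactly the section name.

```python
import re

REQUIRED_SECTIONS = ("Status", "Context", "Decision", "Consequences")


def expected_adr_ids() -> list[str]:
    """ADR-001..010, ADR-PROTO-001..005, and ADR-V2-001 (16 ids in total)."""
    ids = [f"ADR-{n:03d}" for n in range(1, 11)]
    ids += [f"ADR-PROTO-{n:03d}" for n in range(1, 6)]
    ids.append("ADR-V2-001")
    return ids


def missing_sections(markdown: str) -> list[str]:
    """Return the required template sections that have no matching heading."""
    headings = {
        match.group(1).strip()
        for match in re.finditer(r"^#{1,6}[ \t]+(.+?)[ \t]*$", markdown, re.MULTILINE)
    }
    return [section for section in REQUIRED_SECTIONS if section not in headings]


draft = "# ADR-001\n\n## Status\nAccepted\n\n## Context\nWhy this came up.\n"
print(len(expected_adr_ids()))   # 16
print(missing_sections(draft))   # ['Decision', 'Consequences']
```

Running missing_sections over each file in docs/adrs/ would give a quick completeness report before the README.md status table is written.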

Phase 3 — OpenAPI spec curation + export (1.5-2h)

FastAPI emits a usable OpenAPI doc at /openapi.json already. Backstage's API entity needs a stable file path and ideally well-described routes.

  • Inventory routes across the 4 router files: backend/app/main.py, backend/app/dashboard/routes.py, backend/app/widgets/routes.py, backend/app/metrics/routes.py. Add tags=, summary=, and per-route docstrings where missing so the exported spec is review-quality.
  • Add a docs/api/openapi.yaml exported from a running api. Source: curl http://localhost:8000/openapi.json | yq -P > docs/api/openapi.yaml.
  • Add scripts/export-openapi.py — imports app.main:app, dumps OpenAPI to docs/api/openapi.yaml without needing the server running. Wire make export-openapi target in Makefile.
  • Add a CI-friendly diff check (Phase 5 work; flagged here as a hand-off) that fails when docs/api/openapi.yaml is out of date relative to the live spec.
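A possible shape for scripts/export-openapi.py and the hand-off diff check is sketched below. It relies only on FastAPI's app.openapi() method, which returns the spec as a dict, so any object exposing that method works. The function names are hypothetical, PyYAML is treated as optional, and the JSON fallback is an assumption (the output is still parseable by YAML tooling).

```python
import json
from pathlib import Path

try:
    import yaml  # PyYAML, optional; assumed to be available in the backend image
except ImportError:
    yaml = None


def render_spec(app) -> str:
    """Serialise app.openapi() deterministically so diffs stay stable."""
    spec = app.openapi()
    if yaml is not None:
        return yaml.safe_dump(spec, sort_keys=False)
    return json.dumps(spec, indent=2) + "\n"


def export_openapi(app, out_path: Path) -> None:
    """Write the spec to disk without starting the server."""
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(render_spec(app), encoding="utf-8")


def is_stale(app, out_path: Path) -> bool:
    """True when the committed file is missing or differs from the live spec."""
    if not out_path.exists():
        return True
    return out_path.read_text(encoding="utf-8") != render_spec(app)


def main() -> int:
    # Imported lazily so this module can be loaded without the backend installed.
    from app.main import app
    export_openapi(app, Path("docs/api/openapi.yaml"))
    return 0
```

A make export-openapi target could call main(), while a CI job could call is_stale() against the same path and fail when it returns True.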

Phase 4 — Backstage IDP wiring (3h)

This is the actual Spotify Backstage onboarding.

  • catalog-info.yaml at repo root with four entities plus shared annotations:
    • Domain: command-center (top-level grouping).
    • System: asurion-command-center (owns the api + frontend + databricks dependency).
    • Component: command-center-api (kind: Component with spec.type=service, lifecycle=experimental, owner=user:jane.smith); points at dockerfile: backend/Dockerfile, has providesApis: [command-center-api] and consumesApis: [databricks-sql-warehouse].
    • API: command-center-api (type=openapi, definition referencing docs/api/openapi.yaml).
    • Annotations: backstage.io/techdocs-ref: dir:., github.com/project-slug if applicable.
  • mkdocs.yml at repo root with the mkdocs-techdocs-core plugin enabled, nav: covering Phase 1 + 2 pages plus the existing PRDs / widget-builder / lessons-learned / demo-runbook. Use mkdocs-mermaid2-plugin for the architecture diagrams already in the docs.
  • Restructure docs/ to fit the TechDocs convention if needed: keep current files at top level, group ADRs under docs/adrs/, group API specs under docs/api/. Plans and the PRD trio stay where they are; nav can reference them as-is.
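  • Local preview path: make docs-serve runs docker run -w /content -p 8001:8000 -v $(PWD):/content spotify/techdocs:v1.5.0 serve -a 0.0.0.0:8000 (or the equivalent techdocs-cli serve) so docs can be reviewed before publishing. The image's entrypoint is mkdocs, so the serve subcommand and bind address have to be passed explicitly.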
  • Document the publish path in docs/architecture.md Operations section but DO NOT wire CI publishing in this plan — that depends on whether the user has an actual Backstage instance to point at, which is out-of-scope for the hackathon.
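The entity layout described for catalog-info.yaml can be modelled as plain dicts to sanity-check cross-references before the YAML is written. This is a sketch under assumptions: the owner on the Domain, System, and API entities, the API's lifecycle, and the helper names are not specified in this plan. The field names follow the Backstage descriptor format (apiVersion backstage.io/v1alpha1).

```python
API_VERSION = "backstage.io/v1alpha1"
OWNER = "user:jane.smith"  # the plan names this owner for the Component only


def entity(kind, name, spec, annotations=None):
    """Build one Backstage entity descriptor as a dict."""
    metadata = {"name": name}
    if annotations:
        metadata["annotations"] = annotations
    return {"apiVersion": API_VERSION, "kind": kind, "metadata": metadata, "spec": spec}


ENTITIES = [
    entity("Domain", "command-center", {"owner": OWNER}),
    entity("System", "asurion-command-center",
           {"owner": OWNER, "domain": "command-center"}),
    entity("Component", "command-center-api", {
        "type": "service",
        "lifecycle": "experimental",
        "owner": OWNER,
        "system": "asurion-command-center",
        "providesApis": ["command-center-api"],
        "consumesApis": ["databricks-sql-warehouse"],
    }, annotations={"backstage.io/techdocs-ref": "dir:."}),
    entity("API", "command-center-api", {
        "type": "openapi",
        "lifecycle": "experimental",  # assumed to mirror the Component
        "owner": OWNER,
        "system": "asurion-command-center",
        "definition": {"$text": "./docs/api/openapi.yaml"},
    }),
]


def unresolved_refs(entities):
    """List references that no entity in this file defines."""
    defined = {(e["kind"].lower(), e["metadata"]["name"]) for e in entities}
    wanted = []
    for e in entities:
        spec = e["spec"]
        if "domain" in spec:
            wanted.append(("domain", spec["domain"]))
        if "system" in spec:
            wanted.append(("system", spec["system"]))
        for key in ("providesApis", "consumesApis"):
            wanted.extend(("api", name) for name in spec.get(key, []))
    return sorted({f"{kind}:{name}" for kind, name in wanted if (kind, name) not in defined})


print(unresolved_refs(ENTITIES))  # ['api:databricks-sql-warehouse']
```

The one unresolved reference is expected here: databricks-sql-warehouse is consumed but not defined in this file, so it would need to be registered elsewhere in the catalog or added as a fifth entity.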

Phase 5 — Tooling + governance (1h)

  • make docs-serve — local TechDocs preview (Phase 4).
  • make export-openapi — Phase 3 deliverable wired into the Makefile.
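  • make docs-validate: runs mkdocs build --strict (catches broken internal links + missing nav entries) and a basic OpenAPI lint (e.g. npx @redocly/cli lint docs/api/openapi.yaml). Both are meant to run containerised so there are no host deps; the npx form shown assumes Node on the host and needs a container wrapper to meet that goal.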
  • Add a "Documentation discipline" call-out to CLAUDE.md with three rules:
    1. New subsystems require a docs/<subsystem>.md page in the same PR as the code.
    2. ADRs land as new files in docs/adrs/ — never as new headings buried in a PRD.
    3. API contract changes require make export-openapi to refresh docs/api/openapi.yaml in the same PR.
  • Update AGENTS.md "Canonical project docs" table with the new pages.

Risks + open questions

  • Backstage instance availability. Plan stops at "the repo can be onboarded" — actually pointing a Backstage instance at it is out-of-scope. If a real instance exists, Phase 4 grows by 30-60min for the connector config.
  • mkdocs vs TechDocs version drift. Pin mkdocs-techdocs-core to whatever version the target Backstage instance ships (typically v1.x). If unknown, pin to the latest stable and document it.
  • PRD copy of ADRs becomes stale post-extraction. Mitigated by the two-step migration in Phase 2 (pointer remains, content lives in docs/adrs/) and the Phase-5 docs-validate check that flags drift.
  • Time-box overrun likelihood. The ADR extraction (Phase 2) is the highest-variance task because the four-section template requires actual editorial work, not just file moves. If 2-3h becomes 4-5h, defer ADR-V2-001 + ADR-001 (both well-stabilised) and ship the 14 actively-cited ones.

Acceptance gates

  1. Phase 1: Five new docs pages exist; CLAUDE.md reading order includes them; running grep -r "app.sql_gen" docs/ returns content (not a void).
  2. Phase 2: ls docs/adrs/ADR-*.md | wc -l returns 16 (the ADR-* glob excludes the README.md index, which would otherwise push the count to 17); every PRD ADR heading carries a Canonical: pointer; grep -r "ADR-" docs/ backend/ frontend/ shows live links.
  3. Phase 3: make export-openapi produces docs/api/openapi.yaml; npx @redocly/cli lint exits 0; the spec contains all 4 router groups with non-empty descriptions.
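  4. Phase 4: mkdocs build --strict exits 0; catalog-info.yaml validates against the Backstage entity schema (npx @backstage/cli config:check validates app-config.yaml rather than catalog entities, so use a dedicated entity validator or the manual schema check); make docs-serve renders the site locally.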
  5. Phase 5: make docs-validate is green; CLAUDE.md carries the three new documentation discipline rules.
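The "non-empty descriptions" part of gate 3 is not something a generic lint necessarily enforces, so it may be worth a small dedicated check. The sketch below walks an OpenAPI document already loaded as a dict and lists operations lacking tags, a summary, or a description. The function name and the choice of those three fields are assumptions.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def undocumented_operations(spec: dict) -> list[str]:
    """Return one message per operation that lacks tags, summary, or description."""
    problems = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            missing = [
                field for field in ("tags", "summary", "description")
                if not operation.get(field)
            ]
            if missing:
                problems.append(f"{method.upper()} {path}: missing {', '.join(missing)}")
    return sorted(problems)


sample = {
    "paths": {
        "/health": {"get": {"tags": ["ops"], "summary": "Liveness", "description": "Returns ok."}},
        "/metrics": {"get": {"summary": "List metrics"}, "parameters": []},
    }
}
print(undocumented_operations(sample))  # ['GET /metrics: missing tags, description']
```

An empty result on the exported spec would be one concrete way to read gate 3 as passed.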

Cross-references