Backstage TechDocs + Project Documentation Rollout
Active plan in docs/plans/active/backstage-techdocs-rollout.md. Total budget ~10-12h across 5 phases. Phase 1 must ship before any Phase 2+ work: the things this session built (data dictionary loader, routing validator, lineage columns, l3_asurion seeder) have no standalone documentation today, and that is the urgent gap.
Why this plan exists
Two pressures collided:
- Prompt 2 shipped a substantial subsystem with zero standalone documentation. The data-dictionary workflow, `app.sql_gen` package, `config/metric_routing.yaml` boot validator, dual-DDL `metrics_catalog` lineage discipline, and `seed-databricks-l3` workflow all live only in code comments and the docs/plans/active/prompt-2-dictionary-loader.md execution log. New contributors need a real `docs/` page per surface.
- Backstage IDP onboarding is blocked on missing pre-requisites. Backstage needs `catalog-info.yaml`, an OpenAPI spec at a stable path, and `mkdocs.yml` to consume the docs site as TechDocs. The repo has none of these today, and the 16 ADRs referenced everywhere by name are still embedded inside prd.md and prd-v2.1.md, not extracted as individual files Backstage can render.
Target end-state
flowchart LR
    subgraph repo[2026-hackathon repo]
        catalog["catalog-info.yaml"]
        mkdocs["mkdocs.yml"]
        openapi["docs/api/openapi.yaml"]
        adrs["docs/adrs/*.md (16 files)"]
        guides["docs/architecture.md<br/>docs/sql-generator.md<br/>docs/data-dictionary.md<br/>docs/whats-mocked-in-prototype.md<br/>docs/demo-queries.md"]
        existing["docs/widget-builder.md<br/>docs/demo-runbook.md<br/>docs/lessons-learned.md"]
    end
    subgraph backstage[Backstage IDP]
        sw["Software Catalog<br/>(Component, System, Domain)"]
        td["TechDocs<br/>(mkdocs build)"]
        apient["API entity"]
    end
    catalog --> sw
    catalog --> apient
    openapi --> apient
    mkdocs --> td
    guides --> td
    adrs --> td
    existing --> td
Phase 1: Capture Prompt 2 deliverables (3-4h, REQUIRED)
Goal: every surface this session shipped has a discoverable docs page that a new contributor can read end-to-end.
New pages
- docs/architecture.md: top-level architecture overview with a single Mermaid diagram covering FastAPI + Postgres + Redis + Databricks + Bedrock; cross-links to all subsystem pages. References the backend/app/main.py lifespan as the canonical boot order.
- docs/sql-generator.md: covers the `app.sql_gen` package shipped this session: `data_dictionary` models, `dictionary_loader` LRU caching, `type_mapper` rules, and the `routing` boot-validator fail-loud contract. Walks through the read path. Cross-links to prd-v2.1.md §C.4 and ADR-PROTO-002/004/005.
- docs/data-dictionary.md: explains the four CSVs + `ai_query_guidelines.md` under data-dictionary/, the canonical-source-of-truth discipline, the `make seed-databricks-l3` flow that renders DDL on the fly via backend/app/sql_gen/type_mapper.py, and the `make validate-dictionary` two-pass behaviour.
- docs/whats-mocked-in-prototype.md: Q&A defense for stakeholder questions per prd-v2.1.md §C.7. Pairs each Part-B production element with its Part-C prototype mock or deferral (Kafka → diagram-only per ADR-PROTO-001; Bronze/Silver/Gold → asurion_prototype mirror; vector DB → in-prompt dictionary; etc.).
- docs/demo-queries.md: the 3 Databricks-routed metrics (`claim_volume_l3_asurion`, `claims_by_product_l3_asurion`, `claim_status_mix_l3_asurion`) with sample input parameters, generated SQL, and expected output shape. Sourced from backend/app/metrics/seed.py.
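The "fail-loud" contract that docs/sql-generator.md has to explain can be illustrated with a minimal sketch. This is not the code in `app.sql_gen`: the function name `validate_routing`, the exception `RoutingConfigError`, and the shape of the routing mapping (metric name to backend name) are all assumptions made for illustration. The only point shown is that every inconsistency is collected and raised at boot instead of being logged and skipped.

```python
class RoutingConfigError(RuntimeError):
    """Raised at boot when the routing config and the metric catalog disagree."""


def validate_routing(
    routing: dict[str, str],
    catalog_metrics: set[str],
    known_backends: set[str],
) -> None:
    """Check a metric -> backend mapping and raise on any inconsistency.

    All names and the mapping shape are hypothetical; the real validator
    may check more than this.
    """
    problems: list[str] = []
    for metric, backend in sorted(routing.items()):
        if metric not in catalog_metrics:
            problems.append(f"routed metric {metric!r} is not in the catalog")
        if backend not in known_backends:
            problems.append(f"metric {metric!r} routes to unknown backend {backend!r}")
    if problems:
        # Fail loud: report every problem at once and refuse to start.
        raise RoutingConfigError("; ".join(problems))
```

Called from the application lifespan, a check of this kind turns a stale routing entry into a startup failure rather than a runtime surprise, which is the behaviour the docs page should make explicit.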
Edits to existing pages
- docs/lessons-learned.md: append four entries from this session:
  - Dual-DDL source of truth (`db/init.sql` + `_TABLE_DDL` in backend/app/metrics/catalog.py)
  - `MetricEntity` Literal entrenchment (14+ usages; keep entity bare, schema lives on `source_schema`)
  - `custom_widget_placeholder` orphan from old custom-widget persists; fail-loud caught it
  - `make demo-reset` requires a running api (chicken-and-egg when the boot-validator blocks startup); direct `TRUNCATE` is the workaround
- CLAUDE.md reading-order section: add the new docs pages so agents pick them up before changing the SQL gen surface.
- docs/plans/active/part-c-databricks-prototype.md: flip `prompt_2_dictionary_loader` final acceptance to "validated this session, see prompt-2-dictionary-loader plan execution log".
Phase 2: Extract ADRs into a folder (2-3h)
Backstage TechDocs renders one page per Markdown file, so each ADR needs its own file. Today the ADRs are embedded as headings inside two ~1000-line PRD files, which is hostile to discovery.
- Create a docs/adrs/ folder with one file per ADR using the standard 4-section template (Status / Context / Decision / Consequences). 16 files total:
  - From prd.md §19: ADR-001..010 (lines 830-980 approx; verify offsets at edit time).
  - From prd-v2.1.md §C.8: ADR-PROTO-001..005 (line 363+).
  - From prd-v2.1.md §V2: ADR-V2-001 (line 1109+).
- Add a `docs/adrs/README.md` index page with a status table.
- Do NOT delete the embedded copies in the PRDs yet. Instead, add a "Canonical: see docs/adrs/ADR-XXX.md" pointer at each PRD heading. A two-step migration is safer than a rip-and-replace; if the rendered ADRs read well after Phase 4, a future cleanup PR removes the PRD copies.
- Update every existing cross-reference (search the repo for `ADR-`; found in CLAUDE.md, docs/widget-builder.md, docs/plans/, and code comments) to link to the new files.
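The mechanical half of the extraction (slicing each PRD into per-ADR chunks) could be scripted along the following lines. This is a sketch under assumptions: the heading shape matched by `ADR_HEADING` (for example `### ADR-PROTO-002: Title`) is a guess and must be checked against the real PRD headings, and `split_adrs` is a hypothetical helper, not something that exists in the repo. Rewriting each chunk into the Status / Context / Decision / Consequences template remains manual editorial work.

```python
import re

# Assumed heading shape, e.g. "### ADR-001: Title" or "### ADR-PROTO-002: Title".
# Adjust the pattern to whatever prd.md and prd-v2.1.md actually use.
ADR_HEADING = re.compile(
    r"^#{2,6}\s+(ADR-(?:[A-Z0-9]+-)?\d{3})\b.*$", re.MULTILINE
)


def split_adrs(prd_text: str) -> dict[str, str]:
    """Return {adr_id: section_text} for each ADR heading found in prd_text.

    A section runs from its heading to the next ADR heading (or the end of
    the text), so any non-ADR content after the last ADR has to be trimmed
    by hand.
    """
    matches = list(ADR_HEADING.finditer(prd_text))
    sections: dict[str, str] = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(prd_text)
        sections[match.group(1)] = prd_text[match.start():end].strip() + "\n"
    return sections
```

Each returned section could then be written to `docs/adrs/<adr_id>.md` as a starting draft, leaving the PRD copy in place as the two-step migration requires.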
Phase 3: OpenAPI spec curation + export (1.5-2h)
FastAPI emits a usable OpenAPI doc at /openapi.json already. Backstage's API entity needs a stable file path and ideally well-described routes.
- Inventory routes across the 4 router files: backend/app/main.py, backend/app/dashboard/routes.py, backend/app/widgets/routes.py, backend/app/metrics/routes.py. Add `tags=`, `summary=`, and per-route docstrings where missing so the exported spec is review-quality.
- Add a `docs/api/openapi.yaml` exported from a running api. Source: `curl http://localhost:8000/openapi.json | yq -P > docs/api/openapi.yaml`.
- Add scripts/export-openapi.py, which imports `app.main:app` and dumps the OpenAPI spec to `docs/api/openapi.yaml` without needing the server running. Wire a `make export-openapi` target into the Makefile.
- Add a CI-friendly diff check (Phase 5 work; flagged here as a hand-off) that fails when `docs/api/openapi.yaml` is out of date relative to the live spec.
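The staleness check and the route inventory can both operate on plain dictionaries, which keeps them independent of how the spec is loaded. The sketch below assumes the live spec comes from FastAPI's `app.openapi()` and the committed one from parsing `docs/api/openapi.yaml` with a YAML library; the helper names `canonical`, `spec_is_stale`, and `undocumented_operations` are hypothetical.

```python
import json

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def canonical(spec: dict) -> str:
    """Serialize deterministically so key order never causes a false diff."""
    return json.dumps(spec, sort_keys=True, separators=(",", ":"))


def spec_is_stale(committed: dict, live: dict) -> bool:
    """True when the committed spec no longer matches the live one."""
    return canonical(committed) != canonical(live)


def undocumented_operations(spec: dict) -> list[str]:
    """List 'METHOD /path' for operations missing a summary or tags."""
    missing: list[str] = []
    for path, item in sorted(spec.get("paths", {}).items()):
        for method, operation in sorted(item.items()):
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            if not (operation.get("summary") and operation.get("tags")):
                missing.append(f"{method.upper()} {path}")
    return missing
```

A CI job could exit non-zero when `spec_is_stale(...)` is true, and the inventory step could print the result of `undocumented_operations(...)` as a to-do list for the four router files.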
Phase 4: Backstage IDP wiring (3h)
This is the actual Spotify Backstage onboarding.
- `catalog-info.yaml` at repo root with four entities:
  - `Domain: command-center`: top-level grouping.
  - `System: asurion-command-center`: owns the api + frontend + databricks dependency.
  - `Component: command-center-api`: type=service, lifecycle=experimental, owner=user:jane.smith, points at `dockerfile: backend/Dockerfile`, has `providesApis: [command-center-api]` and `consumesApis: [databricks-sql-warehouse]`.
  - `API: command-center-api`: type=openapi, definition referencing `docs/api/openapi.yaml`.
  - Annotations: `backstage.io/techdocs-ref: dir:.`, `github.com/project-slug` if applicable.
- `mkdocs.yml` at repo root with the `mkdocs-techdocs-core` plugin enabled and `nav:` covering the Phase 1 + 2 pages plus the existing PRDs / widget-builder / lessons-learned / demo-runbook. Use `mkdocs-mermaid2-plugin` for the architecture diagrams already in the docs.
- Restructure docs/ to fit the TechDocs convention if needed: keep current files at top level, group ADRs under `docs/adrs/`, group API specs under `docs/api/`. Plans and the PRD trio stay where they are; nav can reference them as-is.
- Local preview path: `make docs-serve` runs `docker run -p 8001:8000 -v $(PWD):/content spotify/techdocs:v1.5.0` (or the equivalent `techdocs-cli serve`) so docs can be reviewed before publishing.
- Document the publish path in the `docs/architecture.md` Operations section but DO NOT wire CI publishing in this plan; that depends on whether the user has an actual Backstage instance to point at, which is out of scope for the hackathon.
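As a rough picture of what the descriptor has to contain, the entities above can be written as Python dictionaries and run through a small structural check. Several values are assumptions that the plan does not state: the owner is only given for the Component and is reused here for the other entities, the API lifecycle is guessed as `experimental`, the `domain` and `system` links are inferred from the grouping, and the `$text` reference is one possible way to point at the spec file. `entity_problems` is a hypothetical helper covering only the required `spec` keys, not the full Backstage schema.

```python
API_VERSION = "backstage.io/v1alpha1"

# Required spec keys per kind in the Backstage descriptor format.
REQUIRED_SPEC = {
    "Domain": {"owner"},
    "System": {"owner"},
    "Component": {"type", "lifecycle", "owner"},
    "API": {"type", "lifecycle", "owner", "definition"},
}

OWNER = "user:jane.smith"  # stated for the Component; assumed for the rest

ENTITIES = [
    {"apiVersion": API_VERSION, "kind": "Domain",
     "metadata": {"name": "command-center"},
     "spec": {"owner": OWNER}},
    {"apiVersion": API_VERSION, "kind": "System",
     "metadata": {"name": "asurion-command-center"},
     "spec": {"owner": OWNER, "domain": "command-center"}},
    {"apiVersion": API_VERSION, "kind": "Component",
     "metadata": {"name": "command-center-api",
                  "annotations": {"backstage.io/techdocs-ref": "dir:."}},
     "spec": {"type": "service", "lifecycle": "experimental", "owner": OWNER,
              "system": "asurion-command-center",
              "providesApis": ["command-center-api"],
              "consumesApis": ["databricks-sql-warehouse"]}},
    {"apiVersion": API_VERSION, "kind": "API",
     "metadata": {"name": "command-center-api"},
     "spec": {"type": "openapi", "lifecycle": "experimental",  # lifecycle assumed
              "owner": OWNER, "system": "asurion-command-center",
              "definition": {"$text": "./docs/api/openapi.yaml"}}},
]


def entity_problems(entity: dict) -> list[str]:
    """Return human-readable problems for one entity; empty means it passes."""
    kind = entity.get("kind")
    name = entity.get("metadata", {}).get("name")
    problems: list[str] = []
    if entity.get("apiVersion") != API_VERSION:
        problems.append(f"{kind}/{name}: unexpected apiVersion")
    if kind not in REQUIRED_SPEC:
        problems.append(f"{name}: unsupported kind {kind!r}")
        return problems
    if not name:
        problems.append(f"{kind}: metadata.name is missing")
    missing = REQUIRED_SPEC[kind] - set(entity.get("spec", {}))
    if missing:
        problems.append(f"{kind}/{name}: missing spec keys {sorted(missing)}")
    return problems
```

In `catalog-info.yaml` itself these would be four YAML documents separated by `---`; the dictionaries map onto that file one-to-one.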
Phase 5: Tooling + governance (1h)
- `make docs-serve`: local TechDocs preview (Phase 4).
- `make export-openapi`: Phase 3 deliverable wired into the Makefile.
- `make docs-validate`: runs `mkdocs build --strict` (catches broken internal links + missing nav entries) and a basic OpenAPI lint (e.g. `npx @redocly/cli lint docs/api/openapi.yaml`). Both are containerised so no host deps.
- Add a "Documentation discipline" call-out to CLAUDE.md with three rules:
  - New subsystems require a `docs/<subsystem>.md` page in the same PR as the code.
  - ADRs land as new files in `docs/adrs/`, never as new headings buried in a PRD.
  - API contract changes require `make export-openapi` to refresh `docs/api/openapi.yaml` in the same PR.
- Update the AGENTS.md "Canonical project docs" table with the new pages.
Risks + open questions
- Backstage instance availability. Plan stops at "the repo can be onboarded" — actually pointing a Backstage instance at it is out-of-scope. If a real instance exists, Phase 4 grows by 30-60min for the connector config.
- mkdocs vs TechDocs version drift. Pin `mkdocs-techdocs-core` to whatever version the target Backstage instance ships (typically v1.x). If unknown, pin to the latest stable and document it.
- PRD copy of ADRs becomes stale post-extraction. Mitigated by the two-step migration in Phase 2 (pointer remains, content lives in docs/adrs/) and the Phase-5 docs-validate check that flags drift.
- Time-box overrun likelihood. The ADR extraction (Phase 2) is the highest-variance task because the four-section template requires actual editorial work, not just file moves. If 2-3h becomes 4-5h, defer ADR-V2-001 + ADR-001 (both well-stabilised) and ship the 14 actively-cited ones.
Acceptance gates
- Phase 1: Five new docs pages exist; CLAUDE.md reading order includes them; running `grep -r "app.sql_gen" docs/` returns content (not a void).
- Phase 2: `ls docs/adrs/ADR-*.md | wc -l` returns 16 (the glob excludes the `README.md` index); every PRD ADR heading carries a `Canonical:` pointer; `grep -r "ADR-" docs/ backend/ frontend/` shows live links.
- Phase 3: `make export-openapi` produces `docs/api/openapi.yaml`; `npx @redocly/cli lint` exits 0; the spec contains all 4 router groups with non-empty descriptions.
- Phase 4: `mkdocs build --strict` exits 0; `catalog-info.yaml` validates under `npx @backstage/cli config:check` (or the manual schema check); `make docs-serve` renders the site locally.
- Phase 5: `make docs-validate` is green; CLAUDE.md carries the three new documentation discipline rules.
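The Phase 2 pointer gate is easy to miss by eye across two long PRDs, so it may be worth automating. The sketch below assumes ADR headings look like `### ADR-PROTO-002: Title` and that the pointer is placed on the first non-blank line after the heading; both are assumptions about the eventual layout, and `headings_missing_pointer` is a hypothetical helper.

```python
import re

# Assumed heading shape, e.g. "### ADR-001: Title" or "### ADR-V2-001: Title".
ADR_HEADING = re.compile(r"^#{2,6}\s+(ADR-(?:[A-Z0-9]+-)?\d{3})\b")


def headings_missing_pointer(prd_text: str) -> list[str]:
    """Return ADR ids whose heading is not followed by a 'Canonical:' line.

    Only the first non-blank line after each heading is inspected.
    """
    lines = prd_text.splitlines()
    missing: list[str] = []
    for index, line in enumerate(lines):
        match = ADR_HEADING.match(line)
        if match is None:
            continue
        following = next((l for l in lines[index + 1:] if l.strip()), "")
        if "Canonical:" not in following:
            missing.append(match.group(1))
    return missing
```

Run against prd.md and prd-v2.1.md, an empty result for both files would satisfy the "every PRD ADR heading carries a `Canonical:` pointer" gate.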
Cross-references
- docs/plans/active/prompt-2-dictionary-loader.md: what Phase 1 documents.
- docs/plans/active/part-c-databricks-prototype.md: Prompt 6 originally planned `docs/architecture.md`, `docs/whats-mocked-in-prototype.md`, `docs/sql-generator.md`, `docs/demo-queries.md`; this plan absorbs that scope.
- prompts.md lines 132-179 (Prompt 2) and the Prompt 6 documentation section.
- Backstage TechDocs documentation: for the `mkdocs-techdocs-core` plugin config.
- Backstage Software Catalog descriptor format: for the `catalog-info.yaml` schema.