Skip to content

ADR-PROTO-002: SQL generator anchored to metrics_catalog (not free-text-to-SQL)

Status: Accepted — implemented in Prompt 3 (commit pending) Source: prd-v2.1.md §C.8 Implementation: backend/app/sql_gen/generator.py, safety.py, routes.py, prompts.py, exceptions.py.

Context

The original SQL generator design proposed free-text NL-to-SQL via Bedrock. Free-text generation is the failure mode of every NL-to-SQL system: LLM invents columns, hallucinates joins, bypasses governance. The metric catalog already defines metric_id, formula, source_table, source_query per ADR-007.

Decision

SQL generation is anchored to a metrics_catalog entry. The LLM's job is bounded: take the catalog entry's source_query template, fill in dimensions and filters from the data_intent, return validated SQL. Free-text generation is rejected at the resolver layer (no metric_id → no Databricks query).

The structural contract is enforced in three places:

  1. Route layer (backend/app/sql_gen/routes.py) — pre-flight rejects requests with no metric_id (or blank), returning 422 application/problem+json with kind=free_text_rejected. The five-way RFC 7807 mapping (free_text_rejected / metric_not_found / metric_not_databricks / safety_violation / bedrock_unavailable) lives here.
  2. Orchestrator (backend/app/sql_gen/generator.py) — resolves metric_id against metrics_catalog, gates on routing.backend == 'databricks', scopes the data dictionary subset to the metric's source_schema.source_table plus joinable parents, and refuses to call Bedrock for any non-Databricks metric.
  3. Safety layer (backend/app/sql_gen/safety.py) — even if the LLM emits something off-script, validate(sql, allowlisted_tables=…) rejects non-allowlisted tables, multi-statement input, forbidden DDL/DML constructs, and LIMIT > max_result_rows. Always parses with dialect='databricks'.

Consequences

  • SQL quality is dramatically higher than freeform NL-to-SQL.
  • Safety is structural, not bolted on.
  • Adding a new Databricks metric requires a metrics_catalog row first — by design, not by accident.
  • One-off "ad hoc Databricks query" use case is not supported in the prototype (would require a separate code path with much weaker safety).
  • All rejection cases (free-text input with no metric_id, metric_id not in metrics_catalog, metric_id whose backend is postgres) return RFC 7807 application/problem+json consistent with the v1 error contract. The route layer never returns a SQL string for a non-Databricks metric.

Cross-references