fix(dbt): deduplicate int_ks4_with_lineage predecessor rows

When multiple predecessor URNs have data for the same current school and
year, use DISTINCT ON (current_urn, year) to keep only the predecessor
with the most pupils (total_pupils descending, NULLs last). This matches
the logic already in int_ks2_with_lineage.
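
As an illustration of the deduplication described above, here is a minimal, self-contained PostgreSQL sketch. The CTE name predecessor_rows, the URNs, and the pupil counts are made up for the example and are not taken from the real model; only the DISTINCT ON / ORDER BY pattern mirrors the change.

```sql
-- Hypothetical sample rows: three predecessors for school 100001 in 2019,
-- one of them with an unknown pupil count, plus a single row for 2018.
with predecessor_rows (current_urn, source_urn, year, total_pupils) as (
    values
        (100001, 200001, 2019, 120),
        (100001, 200002, 2019, 180),
        (100001, 200003, 2019, null::integer),
        (100001, 200001, 2018, 115)
)
-- DISTINCT ON keeps the first row of each (current_urn, year) group,
-- so the ORDER BY must start with the same expressions and then rank
-- rows inside the group: largest total_pupils first, NULLs last.
select distinct on (current_urn, year)
    current_urn,
    source_urn,
    year,
    total_pupils
from predecessor_rows
order by current_urn, year, total_pupils desc nulls last;
-- Returns one row per (current_urn, year):
--   100001 | 200001 | 2018 | 115
--   100001 | 200002 | 2019 | 180
```

Without "nulls last", PostgreSQL sorts NULLs first in descending order, so the predecessor with an unknown pupil count would win the 2019 group.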

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 18:58:50 +00:00
parent f0c76a1724
commit f3a8ebdb4b

@@ -16,7 +16,8 @@ with current_ks4 as (
 ),
 predecessor_ks4 as (
-    select
+    -- If multiple predecessors have data for the same year, keep the one with most pupils.
+    select distinct on (lin.current_urn, ks4.year)
         lin.current_urn,
         ks4.urn as source_urn,
         ks4.year, ks4.total_pupils, ks4.eligible_pupils, ks4.prior_attainment_avg,
@@ -35,6 +36,7 @@ predecessor_ks4 as (
         where curr.urn = lin.current_urn
           and curr.year = ks4.year
     )
+    order by lin.current_urn, ks4.year, ks4.total_pupils desc nulls last
 ),
 combined as (