fix(dbt): deduplicate int_ks4_with_lineage predecessor rows
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m10s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 1m32s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
When multiple predecessor URNs exist for the same current school and year, use DISTINCT ON to keep the one with the most pupils, matching the logic already in int_ks2_with_lineage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -16,7 +16,8 @@ with current_ks4 as (
 ),
 
 predecessor_ks4 as (
-select
+-- If multiple predecessors have data for the same year, keep the one with most pupils.
+select distinct on (lin.current_urn, ks4.year)
 lin.current_urn,
 ks4.urn as source_urn,
 ks4.year, ks4.total_pupils, ks4.eligible_pupils, ks4.prior_attainment_avg,
@@ -35,6 +36,7 @@ predecessor_ks4 as (
 where curr.urn = lin.current_urn
 and curr.year = ks4.year
 )
+order by lin.current_urn, ks4.year, ks4.total_pupils desc nulls last
 ),
 
 combined as (
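To illustrate what the added `distinct on (lin.current_urn, ks4.year)` with `order by ... ks4.total_pupils desc nulls last` is meant to do, here is a minimal Python sketch of the same row-selection rule, run outside the database. The `Row` type, the `keep_largest_predecessor` helper, and all URN and pupil values are hypothetical and only mirror the column names in the diff. It is not how dbt or PostgreSQL executes the model.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    # Hypothetical subset of the predecessor_ks4 columns.
    current_urn: int
    source_urn: int
    year: int
    total_pupils: Optional[int]  # None stands in for SQL NULL


def keep_largest_predecessor(rows: List[Row]) -> List[Row]:
    """Keep one row per (current_urn, year): the one with the most pupils.

    Emulates DISTINCT ON (current_urn, year) combined with
    ORDER BY current_urn, year, total_pupils DESC NULLS LAST.
    """
    def sort_key(r: Row):
        # Non-null counts sort before nulls; larger counts sort first.
        return (r.current_urn, r.year, r.total_pupils is None, -(r.total_pupils or 0))

    kept = {}
    for r in sorted(rows, key=sort_key):
        # The first row seen for each key is the winner, as with DISTINCT ON.
        kept.setdefault((r.current_urn, r.year), r)
    return list(kept.values())


# Made-up data: three predecessors report for school 100 in 2019.
rows = [
    Row(100, 11, 2019, 120),
    Row(100, 12, 2019, 180),
    Row(100, 13, 2019, None),
    Row(100, 11, 2018, None),
    Row(200, 21, 2019, 90),
]
result = keep_largest_predecessor(rows)
for r in result:
    print(r)
```

A row with a null `total_pupils` survives only when no other predecessor has a count for that school and year. When two predecessors tie on `total_pupils`, this sketch keeps whichever came first in the input, whereas PostgreSQL makes no guarantee about which tied row `DISTINCT ON` returns unless the `ORDER BY` includes a further tie-breaker.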