feat(pipeline): add legacy KS4 backfill (2015/16–2018/19)

Mirrors the existing legacy KS2 pattern to fill the gap before EES began
hosting KS4 data. Five files changed:

- tap-uk-ees: LegacyKS4Stream downloads each year's DfE Compare School
  Performance ZIP, extracts england_ks4final.csv, maps 416 legacy columns
  to Singer fields, strips % suffixes. Registered in discover_streams().
  TapUKEES.config_jsonschema gains legacy_ks4_urls setting.

- stg_legacy_ks4.sql: safe_numeric casts + NULL placeholders for columns
  not present in legacy format (ebacc_avg_score, gcse_grade_91_pct,
  prior_attainment_avg, sen_pct).

- int_ks4_with_lineage.sql: adds all_ks4 CTE unioning stg_ees_ks4 and
  stg_legacy_ks4, matching the int_ks2_with_lineage pattern.

- _stg_sources.yml + meltano.yml: source declaration and setting definition
  for legacy_ks4. URLs configured per-year once provided.
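
As a rough illustration of the extract-and-clean step described for
LegacyKS4Stream, the sketch below reads england_ks4final.csv out of a ZIP,
renames columns, and strips trailing % signs. It is not the tap's code:
COLUMN_MAP, its column names, strip_pct, read_legacy_rows and build_demo_zip
are all hypothetical, and the archive is built in memory rather than
downloaded.

```python
import csv
import io
import zipfile

# Hypothetical two-column subset; the real tap maps far more legacy columns.
COLUMN_MAP = {
    "URN": "urn",
    "LEGACY_PASS_PCT": "english_maths_standard_pass_pct",
}


def strip_pct(value: str) -> str:
    """Drop a trailing '%' so '45.2%' becomes '45.2'; other values pass through."""
    value = value.strip()
    return value[:-1] if value.endswith("%") else value


def read_legacy_rows(zip_bytes: bytes, member: str = "england_ks4final.csv"):
    """Yield one mapped record per CSV row found in the named ZIP member."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            for row in csv.DictReader(text):
                yield {
                    field: strip_pct(row[column] or "")
                    for column, field in COLUMN_MAP.items()
                    if column in row
                }


def build_demo_zip() -> bytes:
    """Build a tiny in-memory archive standing in for a downloaded file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "england_ks4final.csv",
            "URN,LEGACY_PASS_PCT\n100001,45.2%\n100002,SUPP\n",
        )
    return buffer.getvalue()


records = list(read_legacy_rows(build_demo_zip()))
```

Values stay as strings here; non-numeric markers such as the second row's
are left for the safe_numeric casts in stg_legacy_ks4.sql to handle.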

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tudor Sitaru
2026-04-16 10:37:24 +01:00
parent 3401654ab9
commit 7e6ded29e2
5 changed files with 206 additions and 6 deletions
@@ -1,6 +1,13 @@
 -- Intermediate model: KS4 data chained across academy conversions
+-- Unions EES (2023/24 onwards) and legacy (2015/16–2018/19) school-level data
-with current_ks4 as (
+with all_ks4 as (
+select * from {{ ref('stg_ees_ks4') }}
+union all
+select * from {{ ref('stg_legacy_ks4') }}
+),
+current_ks4 as (
 select
 urn as current_urn,
 urn as source_urn,
@@ -11,8 +18,8 @@ with current_ks4 as (
 english_maths_strong_pass_pct, english_maths_standard_pass_pct,
 ebacc_entry_pct, ebacc_strong_pass_pct, ebacc_standard_pass_pct, ebacc_avg_score,
 gcse_grade_91_pct,
-sen_pct, sen_ehcp_pct, sen_support_pct
-from {{ ref('stg_ees_ks4') }}
+sen_pct, sen_support_pct, sen_ehcp_pct
+from all_ks4
 ),
 predecessor_ks4 as (
@@ -27,12 +34,12 @@ predecessor_ks4 as (
 ks4.english_maths_strong_pass_pct, ks4.english_maths_standard_pass_pct,
 ks4.ebacc_entry_pct, ks4.ebacc_strong_pass_pct, ks4.ebacc_standard_pass_pct, ks4.ebacc_avg_score,
 ks4.gcse_grade_91_pct,
-ks4.sen_pct, ks4.sen_ehcp_pct, ks4.sen_support_pct
-from {{ ref('stg_ees_ks4') }} ks4
+ks4.sen_pct, ks4.sen_support_pct, ks4.sen_ehcp_pct
+from all_ks4 ks4
 inner join {{ ref('int_school_lineage') }} lin
 on ks4.urn = lin.predecessor_urn
 where not exists (
-select 1 from {{ ref('stg_ees_ks4') }} curr
+select 1 from all_ks4 curr
 where curr.urn = lin.current_urn
 and curr.year = ks4.year
 )
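
The predecessor_ks4 filter in the last hunk can be read as an anti-join: a
predecessor's row is carried forward only when the current URN has no row of
its own for that year. A minimal Python sketch of that rule follows. The
function name, the tuple shapes and the sample URNs are assumptions for
illustration, and the output columns are guessed because the select list of
predecessor_ks4 is not shown in the diff.

```python
def predecessor_rows(all_ks4, lineage):
    """Return (current_urn, source_urn, year) for predecessor rows to keep.

    all_ks4: iterable of (urn, year) pairs (stand-in for the all_ks4 CTE).
    lineage: iterable of (current_urn, predecessor_urn) pairs
             (stand-in for int_school_lineage).
    """
    rows = list(all_ks4)
    existing = set(rows)  # every (urn, year) that already has data
    kept = []
    for urn, year in rows:
        for current_urn, predecessor_urn in lineage:
            # inner join: ks4.urn = lin.predecessor_urn
            if urn != predecessor_urn:
                continue
            # where not exists: the current URN has no row for this year
            if (current_urn, year) not in existing:
                kept.append((current_urn, urn, year))
    return kept


# Hypothetical data: school 100 converted and became school 200.
sample_ks4 = [(200, 2018), (100, 2017), (100, 2018)]
sample_lineage = [(200, 100)]
result = predecessor_rows(sample_ks4, sample_lineage)
```

In this sample only the 2017 row from URN 100 is carried forward, because
URN 200 already has its own 2018 row. Pointing the existence check at
all_ks4 rather than stg_ees_ks4 means legacy-year rows for the current URN
also suppress the predecessor row.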