feat: migrate backend to marts schema, update EES tap for verified datasets

Pipeline:
- EES tap: split KS4 into performance + info streams, fix admissions filename
  (SchoolLevel keyword match), fix census filename (yearly suffix), remove
  phonics (no school-level data on EES), match filenames by substring (in)
  instead of endswith
- stg_ees_ks4: rewrite to filter long-format data and extract Attainment 8,
  Progress 8, EBacc, English/Maths metrics; join KS4 info for context
- stg_ees_admissions: map real CSV columns (total_number_places_offered, etc.)
- stg_ees_census: update source reference, stub with TODO for data columns
- Remove stg_ees_phonics, fact_phonics (no school-level EES data)
- Add ees_ks4_performance + ees_ks4_info sources, remove ees_ks4 + ees_phonics
- Update int_ks4_with_lineage + fact_ks4_performance with new KS4 columns
- Annual EES DAG: remove stg_ees_phonics+ from selector
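
The endswith → in change for filename matching can be illustrated with a
minimal sketch. The filenames and helper names below are hypothetical and are
not taken from the tap; the point is only that a suffix test misses a file
once a version or year suffix is appended, while a keyword test still finds it.

```python
# Hypothetical release listing; real EES filenames may differ.
FILES = [
    "AppsandOffers_2025_SchoolLevel_v2.csv",
    "AppsandOffers_2025_LALevel.csv",
]


def match_endswith(filenames, suffix):
    """Old-style matching: keep files whose name ends with the given suffix."""
    return [f for f in filenames if f.lower().endswith(suffix.lower())]


def match_contains(filenames, keyword):
    """New-style matching: keep files whose name contains the keyword anywhere."""
    return [f for f in filenames if keyword.lower() in f.lower()]


# The suffix test fails because "_v2" sits between the keyword and ".csv".
old_hits = match_endswith(FILES, "SchoolLevel.csv")
# The substring test still selects the school-level file only.
new_hits = match_contains(FILES, "SchoolLevel")
```

Substring matching is more permissive, so the keyword should be specific
enough (here, SchoolLevel rather than Level) to avoid picking up sibling files.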

Backend:
- models.py: repoint all models at marts.* tables by setting schema='marts'
  (DimSchool, DimLocation, KS2Performance, FactOfstedInspection, etc.)
- data_loader.py: rewrite load_school_data_as_dataframe() using raw SQL joining
  dim_school + dim_location + fact_ks2_performance; update get_supplementary_data()
- database.py: remove migration machinery, keep only connection setup
- app.py: remove check_and_migrate_if_needed, remove /api/admin/reimport-ks2
  endpoints (pipeline handles all imports)
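
As a rough sketch of the raw-SQL approach described for data_loader.py, the
example below joins the three marts tables named above. Everything except the
table names is an assumption: the column names (school_name, postcode,
rwm_expected_pct), the join key, and the function names are hypothetical, and
an in-memory SQLite database with an attached "marts" schema stands in for the
real database so the snippet runs on its own.

```python
import sqlite3

# Assumed column names and join key; only the table names come from the commit.
QUERY = """
    SELECT s.urn, s.school_name, l.postcode, k.year, k.rwm_expected_pct
    FROM marts.dim_school AS s
    JOIN marts.dim_location AS l ON l.urn = s.urn
    LEFT JOIN marts.fact_ks2_performance AS k ON k.urn = s.urn
    ORDER BY s.urn, k.year
"""


def make_demo_connection():
    """Build a throwaway database that mimics a 'marts' schema."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    # SQLite has no schemas, so an attached database named 'marts' plays that role.
    conn.execute("ATTACH DATABASE ':memory:' AS marts")
    conn.execute("CREATE TABLE marts.dim_school (urn INTEGER PRIMARY KEY, school_name TEXT)")
    conn.execute("CREATE TABLE marts.dim_location (urn INTEGER PRIMARY KEY, postcode TEXT)")
    conn.execute(
        "CREATE TABLE marts.fact_ks2_performance (urn INTEGER, year INTEGER, rwm_expected_pct REAL)"
    )
    conn.execute("INSERT INTO marts.dim_school VALUES (100001, 'Example Primary')")
    conn.execute("INSERT INTO marts.dim_location VALUES (100001, 'AB1 2CD')")
    conn.executemany(
        "INSERT INTO marts.fact_ks2_performance VALUES (?, ?, ?)",
        [(100001, 2023, 61.0), (100001, 2024, 64.5)],
    )
    return conn


def load_school_rows(conn):
    """Return one dict per school-year row produced by the join."""
    return [dict(row) for row in conn.execute(QUERY)]


rows = load_school_rows(make_demo_connection())
```

The LEFT JOIN keeps schools that have no KS2 rows yet; whether the real loader
wants that behaviour is a design choice this sketch does not settle.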

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 09:29:27 +00:00
parent d82e36e7b2
commit ca351e9d73
18 changed files with 805 additions and 1245 deletions

@@ -88,14 +88,6 @@ models:
       - name: year
         tests: [not_null]
-  - name: fact_phonics
-    description: Phonics screening results — one row per URN per year
-    columns:
-      - name: urn
-        tests: [not_null]
-      - name: year
-        tests: [not_null]
   - name: fact_parent_view
     description: Parent View survey responses
     columns:

@@ -3,8 +3,12 @@
 select
     urn,
     year,
+    school_phase,
     published_admission_number,
     total_applications,
-    first_preference_offers_pct,
-    oversubscribed
+    first_preference_applications,
+    first_preference_offers,
+    first_preference_offer_pct,
+    oversubscribed,
+    admissions_policy
 from {{ ref('stg_ees_admissions') }}

@@ -1,16 +1,42 @@
 -- Mart: KS4 performance fact table — one row per URN per year
 -- Includes predecessor data via lineage resolution
 select
     current_urn as urn,
     source_urn,
     year,
     total_pupils,
-    progress_8_score,
+    eligible_pupils,
+    prior_attainment_avg,
+    -- Attainment 8
     attainment_8_score,
-    ebacc_entry_pct,
-    ebacc_achievement_pct,
-    english_strong_pass_pct,
-    maths_strong_pass_pct,
+    -- Progress 8
+    progress_8_score,
+    progress_8_lower_ci,
+    progress_8_upper_ci,
+    progress_8_english,
+    progress_8_maths,
+    progress_8_ebacc,
+    progress_8_open,
+    -- English & Maths
     english_maths_strong_pass_pct,
-    staying_in_education_pct
+    english_maths_standard_pass_pct,
+    -- EBacc
+    ebacc_entry_pct,
+    ebacc_strong_pass_pct,
+    ebacc_standard_pass_pct,
+    ebacc_avg_score,
+    -- GCSE
+    gcse_grade_91_pct,
+    -- Context
+    sen_pct,
+    sen_ehcp_pct,
+    sen_support_pct
 from {{ ref('int_ks4_with_lineage') }}
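
The commit message describes stg_ees_ks4 as filtering long-format data and
extracting named metrics into wide columns like the ones selected above. A
minimal Python sketch of that filter-and-pivot idea follows. The staging model
itself is SQL and is not shown in this diff, so the input field names (metric,
value) and the metric labels are assumptions; only the output column names come
from the mart.

```python
# Assumed mapping from long-format metric labels to the wide column names.
WANTED = {
    "Attainment 8 score": "attainment_8_score",
    "Progress 8 score": "progress_8_score",
}


def pivot_long_rows(rows):
    """Filter long-format rows to wanted metrics; emit one dict per (urn, year)."""
    wide = {}
    for row in rows:
        column = WANTED.get(row["metric"])
        if column is None:
            continue  # drop metrics that are not being extracted
        key = (row["urn"], row["year"])
        record = wide.setdefault(key, {"urn": row["urn"], "year": row["year"]})
        record[column] = float(row["value"])
    return list(wide.values())


# Hypothetical long-format input: one row per school, year, and metric.
LONG_ROWS = [
    {"urn": 100001, "year": 2024, "metric": "Attainment 8 score", "value": "48.2"},
    {"urn": 100001, "year": 2024, "metric": "Progress 8 score", "value": "0.15"},
    {"urn": 100001, "year": 2024, "metric": "Some other measure", "value": "7"},
]

wide_rows = pivot_long_rows(LONG_ROWS)
```

In SQL the same shape is typically produced with a filter on the metric column
plus conditional aggregation grouped by urn and year.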

@@ -1,8 +0,0 @@
--- Mart: Phonics screening results — one row per URN per year
-select
-    urn,
-    year,
-    year1_phonics_pct,
-    year2_phonics_pct
-from {{ ref('stg_ees_phonics') }}

@@ -1,18 +1,8 @@
 -- Mart: Pupil characteristics — one row per URN per year
+-- TODO: Expand once census data columns are verified and added to staging
 select
     urn,
     year,
-    fsm_pct,
-    sen_support_pct,
-    sen_ehcp_pct,
-    eal_pct,
-    disadvantaged_pct,
-    ethnicity_white_pct,
-    ethnicity_asian_pct,
-    ethnicity_black_pct,
-    ethnicity_mixed_pct,
-    ethnicity_other_pct,
-    class_size_avg,
-    stability_pct
+    phase_type_grouping
 from {{ ref('int_pupil_chars_merged') }}