Pipeline:
- EES tap: split KS4 into performance + info streams, fix admissions filename
(SchoolLevel keyword match), fix census filename (yearly suffix), remove
phonics (no school-level data on EES), change endswith → in for matching
- stg_ees_ks4: rewrite to filter long-format data and extract Attainment 8,
Progress 8, EBacc, English/Maths metrics; join KS4 info for context
- stg_ees_admissions: map real CSV columns (total_number_places_offered, etc.)
- stg_ees_census: update source reference, stub with TODO for data columns
- Remove stg_ees_phonics, fact_phonics (no school-level EES data)
- Add ees_ks4_performance + ees_ks4_info sources, remove ees_ks4 + ees_phonics
- Update int_ks4_with_lineage + fact_ks4_performance with new KS4 columns
- Annual EES DAG: remove stg_ees_phonics+ from selector
Backend:
- models.py: replace all models to point at marts.* tables with schema='marts'
(DimSchool, DimLocation, KS2Performance, FactOfstedInspection, etc.)
- data_loader.py: rewrite load_school_data_as_dataframe() using raw SQL joining
dim_school + dim_location + fact_ks2_performance; update get_supplementary_data()
- database.py: remove migration machinery, keep only connection setup
- app.py: remove check_and_migrate_if_needed, remove /api/admin/reimport-ks2
endpoints (pipeline handles all imports)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two issues caused the backend to drop and reimport school data on restart:
1. schema_version table was in the drop list inside run_full_migration(),
so after any migration the breadcrumb was destroyed and the next
restart would see no version → re-trigger migration
2. Schema version was set after migration, so a crash mid-migration
left no version → infinite re-migration loop
Fix: remove schema_version from the drop list, and set the version
before running migration so crashes don't cause loops.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
On startup, the app now checks if the database schema version matches
the code. If there's a mismatch or no version exists, it automatically
runs a full data migration before starting.
- Add backend/version.py with SCHEMA_VERSION constant
- Add backend/migration.py with extracted migration logic
- Add SchemaVersion model to track DB version
- Add version check functions to database.py
- Update app.py lifespan to use check_and_migrate_if_needed()
- Simplify migrate_csv_to_db.py to use shared logic
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>