The MI CSV contains both OEIF and RC column sets simultaneously: OEIF columns
are populated for older inspections, RC columns for post-Nov-2025 inspections.
File-level detection assigned every school in the file the same framework,
because it looked at column presence alone.
Replace _detect_framework(df) with _framework_for_row(row):
- ReportCard: any rc_* column has a value
- OEIF: overall_effectiveness or quality_of_education has a value
- None: neither has data (no graded inspection on record)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old OEIF CSV contains columns whose names include substrings like
'inclusion' and 'achievement', causing _detect_framework() to wrongly return
'ReportCard' for pre-Nov-2025 inspections.
Fix: check for OEIF-specific phrases first ('overall effectiveness', 'quality
of education', 'behaviour and attitudes'). Only if none are found, look for
multi-word RC-specific phrases. Default to OEIF as a safe fallback.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ofsted replaced single overall grades with Report Cards from Nov 2025.
Both systems are retained during the transition period.
- DB: new framework + 9 RC columns on ofsted_inspections (schema v4)
- Integrator: auto-detect OEIF vs Report Card from CSV column headers;
parse 5-level RC grades and safeguarding met/not-met
- API: expose all new fields in the ofsted response dict
- Frontend: branch on framework='ReportCard' to show safeguarding badge
+ 8-category grid; fall back to legacy OEIF layout otherwise;
always show inspection date in both layouts
- CSS: rcGrade1–5 and safeguardingMet/NotMet classes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The EES statistics API only exposes ~13 publications; admissions data is not
among them. Switch to the EES content API
(content.explore-education-statistics.service.gov.uk), which covers all
publications.
- ees.py: add get_content_release_id() and download_release_zip_csv() that
fetch the release ZIP and extract a named CSV member from it
- admissions.py: use corrected slug
(primary-and-secondary-school-applications-and-offers), correct column
names from actual CSV (school_urn, total_number_places_offered,
times_put_as_1st_preference, etc.), derive first_preference_offers_pct
from offer/application ratio, filter to primary schools only, keep most
recent year per URN
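
The "extract a named CSV member" step in ees.py can be sketched with the
standard library alone. This is an illustration, not the actual
download_release_zip_csv(): the download is omitted, and the helper name
extract_csv_member and its suffix-matching rule are assumptions.

```python
import io
import zipfile


def extract_csv_member(zip_bytes: bytes, member_suffix: str) -> str:
    """Return the text of the first ZIP member whose name ends with member_suffix.

    Assumes the release ZIP has already been downloaded into memory.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith(member_suffix):
                # utf-8-sig strips a leading BOM if the CSV has one.
                return archive.read(name).decode("utf-8-sig")
    raise KeyError(f"no member ending with {member_suffix!r} in release ZIP")
```

Matching on a suffix rather than the full path tolerates releases that nest
the CSV under a folder such as data/.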
Also includes SchoolDetailView UX redesign: parent-first section ordering,
plain-English labels, national average benchmarks, progress score colour
coding, expanded header, quick summary strip, and CSS consolidation.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ofsted renamed all columns in the OEIF framework:
- grades are now 'Latest OEIF overall effectiveness' etc.
- dates are 'Inspection start date of latest OEIF graded inspection'
Replace flat COLUMN_MAP with a priority list per field so both current
OEIF and legacy column names work without duplicate-column conflicts.
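
A sketch of the priority-list idea, assuming the CSV headers are available as
a list of strings. The 'Latest OEIF ...' names come from the note above; the
legacy header names and the resolve_columns helper are assumptions made for
illustration.

```python
from typing import Dict, List, Optional, Sequence

# Candidate headers per field, highest priority first.
COLUMN_PRIORITY: Dict[str, List[str]] = {
    "overall_effectiveness": [
        "Latest OEIF overall effectiveness",
        "Overall effectiveness",  # assumed legacy header
    ],
    "inspection_date": [
        "Inspection start date of latest OEIF graded inspection",
        "Inspection start date",  # assumed legacy header
    ],
}


def resolve_columns(headers: Sequence[str]) -> Dict[str, Optional[str]]:
    """Pick exactly one source column per field, or None if none is present."""
    available = set(headers)
    return {
        field: next((name for name in candidates if name in available), None)
        for field, candidates in COLUMN_PRIORITY.items()
    }
```

Because each field resolves to at most one source column, a file that carries
both the current and the legacy header cannot produce two columns with the
same target name.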
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kestra's HTTP client socket read timeout is shorter than any reasonable
wait for a full geocoded migration. POST /api/admin/reimport-ks2 returns
immediately with {"status":"started"}; the backend runs the job in a thread.
To follow progress, poll GET /api/admin/reimport-ks2/status or watch the UI
for schools to appear.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ofsted CSV has a variable number of preamble rows (title, filter warning,
etc.) before the real column headers. Scan up to 10 rows to find the one
containing a URN column rather than assuming a fixed offset.
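
The header scan might be sketched as follows. The exact-match test on a cell
equal to "URN" is an assumption; the real check may be looser.

```python
import csv
import io
from typing import Optional

MAX_PREAMBLE_ROWS = 10  # scan limit described above


def find_header_row(csv_text: str, max_rows: int = MAX_PREAMBLE_ROWS) -> Optional[int]:
    """Return the 0-based index of the first row containing a URN cell."""
    reader = csv.reader(io.StringIO(csv_text))
    for index, row in enumerate(reader):
        if index >= max_rows:
            break
        if any(cell.strip().upper() == "URN" for cell in row):
            return index
    # No header found within the scan window.
    return None
```

The returned index can then be used as the number of rows to skip before the
header when loading the file.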
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The geocoding pass over ~15k schools takes longer than any reasonable
HTTP timeout. New approach:
- POST /api/admin/reimport-ks2 starts migration in background thread,
returns {"status":"started"} immediately
- GET /api/admin/reimport-ks2/status returns {running, done}
- ks2.py polls status every 30s (max 2h) before returning
- Kestra flow timeout bumped to PT2H
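
The start/status split can be sketched without any web framework. The
BackgroundJob class is hypothetical, and the "already_running" response is an
assumption; only {"status":"started"} and {running, done} come from the notes
above.

```python
import threading
from typing import Callable, Dict, Optional


class BackgroundJob:
    """Run one long task in a thread and report {running, done}."""

    def __init__(self, target: Callable[[], object]) -> None:
        self._target = target
        self._lock = threading.Lock()
        self._running = False
        self._done = False
        self._thread: Optional[threading.Thread] = None

    def start(self) -> Dict[str, str]:
        # Handler body for POST /api/admin/reimport-ks2: returns immediately.
        with self._lock:
            if self._running:
                return {"status": "already_running"}  # assumed response
            self._running, self._done = True, False
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()
        return {"status": "started"}

    def _run(self) -> None:
        try:
            self._target()
        finally:
            # Marked done even if the task raised; a fuller version
            # would also record the error.
            with self._lock:
                self._running, self._done = False, True

    def status(self) -> Dict[str, bool]:
        # Handler body for GET /api/admin/reimport-ks2/status.
        with self._lock:
            return {"running": self._running, "done": self._done}
```

A caller such as ks2.py would then poll status() every 30s until done is
true or the 2h limit is reached.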
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add geocode query param to /api/admin/reimport-ks2 (defaults true).
ks2.py passes ?geocode=true so postcodes are resolved to lat/lng in
the same migration pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Kestra requires retry.type to be set (e.g. constant, exponential).
Also rename delay -> interval, which is the correct field for constant retry.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wait up to 120s for /api/v1/flows/search to respond before attempting
imports, giving a clearer error if the URL is wrong or Kestra isn't up.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- POST /api/v1/flows with Content-Type: application/x-yaml (not the
ZIP-based /import endpoint)
- On 409 (already exists), fall back to PUT /api/v1/flows/{ns}/{id}
so redeployment updates existing flows rather than failing
- Print HTTP response body on error for easier debugging
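
The create-then-update fallback, sketched in Python for illustration (the
init container itself may well use a different tool). The deploy_flow and
_send names are hypothetical, and the injectable send parameter exists only
so the logic can be exercised without a running Kestra.

```python
import urllib.error
import urllib.request
from typing import Callable


def _send(method: str, url: str, body: bytes) -> int:
    """Send a YAML body and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/x-yaml"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as err:
        if err.code != 409:
            # Print the response body on error for easier debugging.
            print(err.read().decode("utf-8", errors="replace"))
        return err.code


def deploy_flow(
    base_url: str,
    namespace: str,
    flow_id: str,
    yaml_text: str,
    send: Callable[[str, str, bytes], int] = _send,
) -> int:
    body = yaml_text.encode("utf-8")
    status = send("POST", f"{base_url}/api/v1/flows", body)
    if status == 409:
        # Flow already exists: update it in place instead of failing.
        status = send("PUT", f"{base_url}/api/v1/flows/{namespace}/{flow_id}", body)
    return status
```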
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
kestra/kestra:latest is ~500MB; the registry rejects the push.
The init container only needs to POST flow YAMLs to the Kestra REST API
(/api/v1/flows/import), which curl handles fine from a tiny alpine base.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bind mounts don't work on the remote Portainer host since the files
aren't present there. Instead, Dockerfile.init copies the flow YAMLs
into a dedicated image (kestra/kestra:latest base) that is built in CI
and pulled by Portainer like the other images.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- backend: POST /api/admin/reimport-ks2 runs full CSV migration in a thread
- backend/docker-compose: ADMIN_API_KEY env var (default: changeme) so the
key is stable across restarts and the integrator can call the endpoint
- integrator: sources/ks2.py triggers the backend endpoint (900s timeout)
- integrator: flows/ks2.yml Kestra flow (manual trigger, no schedule)
To re-ingest after a DB wipe: trigger the ks2-reimport flow from the
Kestra UI at http://localhost:8080.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a full data integration pipeline for enriching school profiles with
supplementary data from Ofsted, GIAS, EES, IDACI, and FBIT.
Backend:
- Bump SCHEMA_VERSION to 3; add 8 new DB tables (ofsted_inspections,
ofsted_parent_view, school_census, admissions, sen_detail, phonics,
school_deprivation, school_finance) plus GIAS columns on schools
- Expose all supplementary data via GET /api/schools/{urn}
- Enrich school list responses with ofsted_grade + ofsted_date
Integrator (new service):
- FastAPI HTTP microservice; Kestra calls POST /run/{source}
- 9 source modules: ofsted, gias, parent_view, census, admissions,
sen_detail, phonics, idaci, finance
- 9 Kestra flow YAMLs with scheduled triggers and 3× retry
Frontend:
- SchoolRow: colour-coded Ofsted badge (Outstanding/Good/RI/Inadequate)
- SchoolDetailView: 7 new sections — Ofsted sub-judgements, Parent View
survey bars, Admissions, Pupils & Inclusion / SEN, Phonics, Deprivation
Context, Finances
- types.ts: 8 new interfaces + extended School/SchoolDetailsResponse
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>