Commit Graph

5 Commits

Author SHA1 Message Date
Tudor Sitaru f05bbba613 perf: resolve all P1–P5 performance issues from code review
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 21s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 50s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 12s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
P1 (backend/data_loader.py): Add load_latest_school_data() which pre-computes
the one-row-per-school latest-year snapshot (groupby, prev-year trend merge)
once at startup instead of on every /api/schools request. get_schools route
now starts from the cached snapshot rather than rebuilding it.

S3 (backend/app.py): Wrap synchronous geocode_single_postcode() call in
asyncio.to_thread() so postcode lookups no longer block the uvicorn event
loop. Admin reload endpoint also uses to_thread for both cache primes.

P2 (nextjs-app/components/HomeView.tsx): Add mapParamsRef guard so switching
back to map view does not re-fetch 500 schools when search params haven't
changed. Reset ref on new searches so fresh data is always fetched.

P3 (nextjs-app/lib/chartSetup.ts): Extract Chart.js registration into a
shared side-effect module. ComparisonChart and PerformanceChart now import
it instead of each calling ChartJS.register() independently.

P4 (backend/database.py): Remove unnecessary db.commit() from the read-only
get_db_session() context manager — saves a DB round-trip on every request.

P5 (backend/database.py): Add pool_recycle=1800 to SQLAlchemy engine to
prevent stale TCP connections from accumulating in long-running processes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 22:45:46 +01:00
tudor ca351e9d73 feat: migrate backend to marts schema, update EES tap for verified datasets
Pipeline:
- EES tap: split KS4 into performance + info streams, fix admissions filename
  (SchoolLevel keyword match), fix census filename (yearly suffix), remove
  phonics (no school-level data on EES), change endswith → in for matching
- stg_ees_ks4: rewrite to filter long-format data and extract Attainment 8,
  Progress 8, EBacc, English/Maths metrics; join KS4 info for context
- stg_ees_admissions: map real CSV columns (total_number_places_offered, etc.)
- stg_ees_census: update source reference, stub with TODO for data columns
- Remove stg_ees_phonics, fact_phonics (no school-level EES data)
- Add ees_ks4_performance + ees_ks4_info sources, remove ees_ks4 + ees_phonics
- Update int_ks4_with_lineage + fact_ks4_performance with new KS4 columns
- Annual EES DAG: remove stg_ees_phonics+ from selector

Backend:
- models.py: replace all models to point at marts.* tables with schema='marts'
  (DimSchool, DimLocation, KS2Performance, FactOfstedInspection, etc.)
- data_loader.py: rewrite load_school_data_as_dataframe() using raw SQL joining
  dim_school + dim_location + fact_ks2_performance; update get_supplementary_data()
- database.py: remove migration machinery, keep only connection setup
- app.py: remove check_and_migrate_if_needed, remove /api/admin/reimport-ks2
  endpoints (pipeline handles all imports)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 09:29:27 +00:00
tudor d1d994c1a2 fix(startup): stop re-migrating on every container restart
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 48s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m13s
Build and Push Docker Images / Build Integrator (push) Successful in 58s
Build and Push Docker Images / Build Kestra Init (push) Successful in 35s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
Two issues caused the backend to drop and reimport school data on restart:

1. schema_version table was in the drop list inside run_full_migration(),
   so after any migration the breadcrumb was destroyed and the next
   restart would see no version → re-trigger migration
2. Schema version was set after migration, so a crash mid-migration
   left no version → infinite re-migration loop

Fix: remove schema_version from the drop list, and set the version
before running migration so crashes don't cause loops.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-25 10:57:01 +00:00
Tudor f4919db3b9 Add automatic schema versioning with startup migration
Build and Push Docker Image / build-and-push (push) Successful in 57s
On startup, the app now checks if the database schema version matches
the code. If there's a mismatch or no version exists, it automatically
runs a full data migration before starting.

- Add backend/version.py with SCHEMA_VERSION constant
- Add backend/migration.py with extracted migration logic
- Add SchemaVersion model to track DB version
- Add version check functions to database.py
- Update app.py lifespan to use check_and_migrate_if_needed()
- Simplify migrate_csv_to_db.py to use shared logic

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 10:23:02 +00:00
Tudor Sitaru 4668e19c45 database 2
Build and Push Docker Image / build-and-push (push) Failing after 33s
2026-01-06 17:22:39 +00:00