feat(integrator): add KS2 re-import via Kestra and backend admin endpoint

- backend: POST /api/admin/reimport-ks2 runs full CSV migration in a thread
- backend/docker-compose: ADMIN_API_KEY env var (default: changeme) so the
  key is stable across restarts and the integrator can call the endpoint
- integrator: sources/ks2.py triggers the backend endpoint (900s timeout)
- integrator: flows/ks2.yml Kestra flow (manual trigger, no schedule)

To re-ingest after a DB wipe: trigger the ks2-reimport flow from the
Kestra UI at http://localhost:8080.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 12:25:29 +00:00
parent 822ec936bf
commit f1fb847164
5 changed files with 98 additions and 0 deletions
@@ -0,0 +1,47 @@
+"""
+KS2 attainment data re-importer.
+
+Triggers a full re-import of the KS2 CSV data by calling the backend's
+admin endpoint. The backend owns the migration logic and CSV column mappings;
+this module is a thin trigger so the re-import can be orchestrated via Kestra
+like all other data sources.
+
+The CSV files must already be present in the data volume under
+  /data/{year}/england_ks2final.csv
+(populated at deploy time from the repo's data/ directory).
+"""
+import sys
+import requests
+from config import BACKEND_URL, ADMIN_API_KEY
+
+
+def download():
+    """No download step — CSVs are shipped with the repo."""
+    print("KS2 CSVs are bundled in the data volume; no download needed.")
+    return {"skipped": True}
+
+
+def load():
+    """Trigger full KS2 re-import via the backend admin endpoint."""
+    url = f"{BACKEND_URL}/api/admin/reimport-ks2"
+    print(f"POST {url}")
+    resp = requests.post(
+        url,
+        headers={"X-API-Key": ADMIN_API_KEY},
+        timeout=900,  # migration can take ~10 minutes
+    )
+    resp.raise_for_status()
+    result = resp.json()
+    print(f"Result: {result}")
+    return result
+
+
+if __name__ == "__main__":
+    import argparse
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--action", choices=["download", "load", "all"], default="all")
+    args = parser.parse_args()
+    if args.action in ("download", "all"):
+        download()
+    if args.action in ("load", "all"):
+        load()
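
The trigger in load() can be exercised without a running backend by passing the HTTP call in as a parameter. The following is a minimal sketch of that idea, not part of this commit: build_reimport_request and trigger_reimport are hypothetical names, and the only details taken from the diff are the endpoint path, the X-API-Key header, and the 900s timeout.

```python
from typing import Any, Callable, Dict


def build_reimport_request(backend_url: str, api_key: str) -> Dict[str, Any]:
    """Build keyword arguments for the POST call (hypothetical helper)."""
    return {
        # Strip a trailing slash so the joined URL has no double slash.
        "url": f"{backend_url.rstrip('/')}/api/admin/reimport-ks2",
        "headers": {"X-API-Key": api_key},
        "timeout": 900,  # same value as load() in the diff
    }


def trigger_reimport(post: Callable[..., Any], backend_url: str, api_key: str) -> Any:
    """Call the admin endpoint through an injected `post` callable.

    In production `post` would be requests.post; in a test it can be a stub
    that returns any object with raise_for_status() and json().
    """
    resp = post(**build_reimport_request(backend_url, api_key))
    resp.raise_for_status()  # surface 4xx/5xx responses as exceptions
    return resp.json()
```

With requests installed, trigger_reimport(requests.post, BACKEND_URL, ADMIN_API_KEY) would behave like load() minus the print statements, assuming the backend returns a JSON body as load() already expects.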