fix(ks2): fire-and-forget instead of polling to avoid socket timeout

Kestra's HTTP client socket read timeout is shorter than any reasonable
wait for a full geocoded migration, so the integrator no longer polls.
POST /api/admin/reimport-ks2 returns immediately with {status:started};
the backend runs the job in a background thread. To track progress, check
GET /api/admin/reimport-ks2/status or watch for schools to appear in the UI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-24 21:21:31 +00:00
parent 7f9c61d587
commit d00dc699cc
2 changed files with 12 additions and 29 deletions
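
The commit message above describes a fire-and-forget contract: the POST returns at once and a status endpoint reports progress. The backend code is not part of this commit, so the following is only a minimal sketch of that pattern using the standard library. The function names, the in-memory state dict, and the "already_running" response are assumptions for illustration; the "running" and "done" keys mirror the fields the removed polling loop used to read.

```python
import threading

# Hypothetical in-memory job state; the real backend's bookkeeping is not shown here.
_state = {"running": False, "done": False, "error": None}
_lock = threading.Lock()


def _run_migration(work):
    """Run the long job in the worker thread and record how it ended."""
    try:
        work()
        outcome = {"running": False, "done": True, "error": None}
    except Exception as exc:  # keep the failure visible to the status endpoint
        outcome = {"running": False, "done": False, "error": str(exc)}
    with _lock:
        _state.update(outcome)


def start_reimport(work):
    """Sketch of a handler body for POST /api/admin/reimport-ks2."""
    with _lock:
        if _state["running"]:
            # Assumed behaviour: refuse to start a second concurrent run.
            return {"status": "already_running"}
        _state.update(running=True, done=False, error=None)
    threading.Thread(target=_run_migration, args=(work,), daemon=True).start()
    return {"status": "started"}


def reimport_status():
    """Sketch of a handler body for GET /api/admin/reimport-ks2/status."""
    with _lock:
        return dict(_state)
```

Because the state is marked as running before the thread starts, a status request issued right after the POST should already report the job as in progress.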


@@ -10,7 +10,7 @@ tasks:
     uri: http://integrator:8001/run/ks2?action=load
     method: POST
     allowFailed: false
-    timeout: PT2H # polls backend every 30s; geocoding 15k schools takes up to 1h
+    timeout: PT30S # fire-and-forget; backend runs migration in background
 errors:
   - id: notify-failure


@@ -10,13 +10,10 @@ The CSV files must already be present in the data volume under
 /data/{year}/england_ks2final.csv
 (populated at deploy time from the repo's data/ directory).
 """
-import time
 import requests
 from config import BACKEND_URL, ADMIN_API_KEY
 HEADERS = {"X-API-Key": ADMIN_API_KEY}
-POLL_INTERVAL = 30 # seconds between status checks
-MAX_WAIT = 7200 # 2 hours
 def download():
@@ -26,33 +23,19 @@ def download():
 def load():
-    """Trigger full KS2 re-import and poll until complete."""
-    start_url = f"{BACKEND_URL}/api/admin/reimport-ks2?geocode=true"
-    status_url = f"{BACKEND_URL}/api/admin/reimport-ks2/status"
+    """Trigger KS2 re-import on the backend and return immediately.

-    print(f"POST {start_url}")
-    resp = requests.post(start_url, headers=HEADERS, timeout=30)
+    The migration (including geocoding) runs as a background thread on the
+    backend and can take up to an hour. Poll GET /api/admin/reimport-ks2/status
+    to check progress, or simply wait for schools to appear in the UI.
+    """
+    url = f"{BACKEND_URL}/api/admin/reimport-ks2?geocode=true"
+    print(f"POST {url}")
+    resp = requests.post(url, headers=HEADERS, timeout=30)
     resp.raise_for_status()
-    print(f"Started: {resp.json()}")
+    result = resp.json()
+    print(f"Result: {result}")

-    print(f"Polling {status_url} every {POLL_INTERVAL}s (max {MAX_WAIT // 60} min)...")
-    elapsed = 0
-    while elapsed < MAX_WAIT:
-        time.sleep(POLL_INTERVAL)
-        elapsed += POLL_INTERVAL
-        sr = requests.get(status_url, headers=HEADERS, timeout=15)
-        sr.raise_for_status()
-        state = sr.json()
-        print(f"  [{elapsed // 60}m] {state}")
-        if state.get("done"):
-            print("Re-import complete.")
-            return state
-        if not state.get("running"):
-            raise RuntimeError(f"Re-import stopped unexpectedly: {state}")
-    raise TimeoutError(f"KS2 re-import did not complete within {MAX_WAIT // 60} minutes")
 if __name__ == "__main__":
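
With polling removed from load(), a caller who still wants to block until the migration finishes has to poll the status endpoint on their own side. The sketch below is one way that could look, assuming the status payload still carries the "done" and "running" keys that the removed loop checked. The helper name wait_for_reimport and its parameters are hypothetical and not part of this commit; the status fetch is passed in as a callable so the helper does not depend on any particular HTTP client.

```python
import time


def wait_for_reimport(fetch_status, poll_interval=30.0, max_wait=3600.0, sleep=time.sleep):
    """Poll until the job reports done; raise if it stops early or the deadline passes.

    fetch_status is any zero-argument callable returning the status dict,
    for example a wrapper around GET /api/admin/reimport-ks2/status.
    """
    elapsed = 0.0
    while elapsed < max_wait:
        state = fetch_status()
        if state.get("done"):
            return state
        if not state.get("running"):
            # Neither done nor running: the job ended without finishing.
            raise RuntimeError(f"Re-import stopped unexpectedly: {state}")
        sleep(poll_interval)
        elapsed += poll_interval
    raise TimeoutError(f"KS2 re-import did not complete within {max_wait:.0f} seconds")
```

With requests, fetch_status could be something like `lambda: requests.get(status_url, headers=HEADERS, timeout=15).json()`, run from a shell or a script rather than from inside a Kestra HTTP task, since that task's socket timeout is what this commit works around.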