feat: national average reference line now tracks per year on history chart

Previously the dashed reference line was a flat horizontal line drawn at the latest year's national average across all historical data, implying the national figure was constant. Now the backend returns per-year averages in `by_year` and the chart maps each data year to its own national average, so the reference line reflects how the national picture changed over time (including the COVID dip and recovery).

- backend: /api/national-averages now includes a `by_year` list alongside the existing `year`/`primary`/`secondary` latest-year snapshot
- types: NationalAverages extended with `by_year: NationalAveragesYear[]`
- PerformanceChart: accepts `nationalByYear` prop; builds per-year series aligned to school data years, falling back to the scalar prop if absent
- SchoolDetailView + SecondarySchoolDetailView: pass `nationalAvg.by_year`

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
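
For reviewers, the alignment the chart performs can be sketched in Python as a rough illustration, not the actual PerformanceChart code (which lives in the frontend). The helper name `national_series`, its parameters, and the choice to leave a gap (`None`) for years missing from `by_year` are assumptions made for this sketch; only the `by_year` entry shape (`year`, `primary`, `secondary`) comes from this commit.

```python
from typing import Optional


def national_series(
    school_years: list[int],
    by_year: Optional[list[dict]],
    phase: str,
    metric: str,
    fallback: Optional[float] = None,
) -> list[Optional[float]]:
    """Hypothetical helper: one national-average value per school data year."""
    # No per-year data at all: fall back to the flat latest-year scalar.
    if not by_year:
        return [fallback for _ in school_years]

    # Index the per-year entries by year for direct lookup.
    lookup = {
        entry["year"]: entry.get(phase, {}).get(metric)
        for entry in by_year
    }
    # Assumption: a year missing from by_year becomes a gap (None),
    # rather than silently reusing the latest-year value.
    return [lookup.get(yr) for yr in school_years]


# Illustrative values only, not real national figures.
example_by_year = [
    {"year": 2019, "primary": {"rwm_expected_pct": 65.0}, "secondary": {}},
    {"year": 2023, "primary": {"rwm_expected_pct": 60.0}, "secondary": {}},
]
series = national_series(
    [2019, 2022, 2023], example_by_year, "primary", "rwm_expected_pct", 61.0
)
```

Here `series` lines up index-for-index with the school's data years, so it can be plotted as a second line on the same x-axis.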
2026-04-09 13:55:14 +01:00
parent 23f881b797
commit a3cfffa4d0
5 changed files with 57 additions and 10 deletions
@@ -676,9 +676,6 @@ async def get_national_averages(request: Request):
    if df.empty:
        return {"primary": {}, "secondary": {}}

-    latest_year = int(df["year"].max())
-    df_latest = df[df["year"] == latest_year]
-
    ks2_metrics = [
        "rwm_expected_pct", "rwm_high_pct",
        "reading_expected_pct", "writing_expected_pct", "maths_expected_pct",
@@ -703,15 +700,30 @@ async def get_national_averages(request: Request):
                    out[col] = round(float(val.mean()), 2)
        return out

+    latest_year = int(df["year"].max())
+    df_latest = df[df["year"] == latest_year]
+
    # Primary: schools where KS2 data is non-null
    primary_df = df_latest[df_latest["rwm_expected_pct"].notna()]
    # Secondary: schools where KS4 data is non-null
    secondary_df = df_latest[df_latest["attainment_8_score"].notna()]

+    # Per-year averages for every year in the dataset (used by chart reference lines)
+    by_year = []
+    for yr in sorted(df["year"].dropna().unique()):
+        yr = int(yr)
+        df_yr = df[df["year"] == yr]
+        by_year.append({
+            "year": yr,
+            "primary": _means(df_yr[df_yr["rwm_expected_pct"].notna()], ks2_metrics),
+            "secondary": _means(df_yr[df_yr["attainment_8_score"].notna()], ks4_metrics),
+        })
+
    return {
        "year": latest_year,
        "primary": _means(primary_df, ks2_metrics),
        "secondary": _means(secondary_df, ks4_metrics),
+        "by_year": by_year,
    }