feat(pipeline): implement parent-view, fbit, idaci Singer taps + align staging/mart models

Port extraction logic from integrator scripts into Singer SDK taps:
- tap-uk-parent-view: scrapes Ofsted open data portal, parses survey responses (14 questions)
- tap-uk-fbit: queries FBIT API per-URN with rate limiting, computes per-pupil spend
- tap-uk-idaci: downloads IoD2019 XLSX, batch-resolves postcodes→LSOAs via postcodes.io
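
As a rough illustration of the tap-uk-fbit bullet above, the sketch below shows one way to pace per-URN requests and derive per-pupil spend. The names `MinIntervalThrottle` and `per_pupil_spend` and the choice to return `None` for missing or zero pupil counts are assumptions for illustration; the tap's actual code and the FBIT API response shape are not part of this commit view.

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Hypothetical helper: keeps successive calls at least `min_interval` seconds apart."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call, if any

    def wait(self) -> None:
        """Call once before each per-URN request."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def per_pupil_spend(total_spend: Optional[float], pupils: Optional[float]) -> Optional[float]:
    """Return spend divided by pupil count, or None when it cannot be computed."""
    if total_spend is None or pupils is None or pupils <= 0:
        return None
    return round(total_spend / pupils, 2)
```

Injecting `clock` and `sleep` keeps the throttle testable without real delays; a tap would call `wait()` before each request and pass the returned totals through `per_pupil_spend`.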

Update dbt models to match actual tap output schemas:
- stg_idaci now includes URN (tap does the postcode→LSOA→school join)
- stg_parent_view expanded from 8 to 13 question columns
- fact_deprivation simplified (no longer needs postcode→LSOA join in dbt)
- fact_parent_view expanded to include all 13 question metrics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:38:07 +00:00
parent 904093ea8a
commit 97d975114a
9 changed files with 360 additions and 60 deletions

@@ -107,4 +107,7 @@ models:
         tests: [not_null]
   - name: fact_deprivation
-    description: IDACI deprivation index
+    description: IDACI deprivation index — one row per URN
+    columns:
+      - name: urn
+        tests: [not_null, unique]

@@ -1,22 +1,9 @@
 -- Mart: Deprivation index — one row per URN
--- Joins school postcode → LSOA → IDACI score
-with school_postcodes as (
-    select
-        urn,
-        postcode
-    from {{ ref('stg_gias_establishments') }}
-    where status = 'Open'
-      and postcode is not null
-)
--- Note: The join between postcode and LSOA requires a postcode-to-LSOA
--- lookup table. This will be populated by the geocode script or a seed.
--- For now, this model serves as a placeholder that will be completed
--- once the IDACI tap provides the postcode→LSOA mapping.
+-- The IDACI tap already resolves postcode → LSOA → IoD2019 score per school.
 select
-    i.lsoa_code,
-    i.idaci_score,
-    i.idaci_decile
-from {{ ref('stg_idaci') }} i
+    urn,
+    lsoa_code,
+    idaci_score,
+    idaci_decile
+from {{ ref('stg_idaci') }}

@@ -6,10 +6,15 @@ select
     total_responses,
     q_happy_pct,
     q_safe_pct,
-    q_progress_pct,
-    q_well_taught_pct,
-    q_well_led_pct,
     q_behaviour_pct,
     q_bullying_pct,
+    q_communication_pct,
+    q_progress_pct,
+    q_teaching_pct,
+    q_information_pct,
+    q_curriculum_pct,
+    q_future_pct,
+    q_leadership_pct,
+    q_wellbeing_pct,
     q_recommend_pct
 from {{ ref('stg_parent_view') }}

@@ -67,3 +67,6 @@ sources:
       - name: idaci
         description: Income Deprivation Affecting Children Index lookups
+        columns:
+          - name: urn
+            tests: [not_null]

@@ -1,4 +1,6 @@
 -- Staging model: Income Deprivation Affecting Children Index
+-- The IDACI tap resolves postcode → LSOA and joins to IoD2019 data,
+-- so each row already has a URN.
 with source as (
     select * from {{ source('raw', 'idaci') }}
@@ -6,10 +8,12 @@ with source as (
 renamed as (
     select
+        cast(urn as integer) as urn,
         lsoa_code,
         cast(idaci_score as numeric) as idaci_score,
         cast(idaci_decile as integer) as idaci_decile
     from source
+    where urn is not null
 )
 select * from renamed
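
The postcode → LSOA → IoD2019 resolution that the staging comment attributes to the tap could look roughly like the sketch below. It is not the tap's code: the helper names, the 100-postcode batch size, and the `codes.lsoa` field of the postcodes.io bulk response are assumptions, and the HTTP call itself is left out so the logic stays self-contained.

```python
from typing import Dict, Iterable, Iterator, List, Mapping

BULK_LIMIT = 100  # assumed per-request cap for the postcodes.io bulk lookup


def chunked(items: List[str], size: int = BULK_LIMIT) -> Iterator[List[str]]:
    """Split the postcode list into batches small enough for one bulk request."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lsoa_by_postcode(bulk_response: Mapping) -> Dict[str, str]:
    """Map each queried postcode to its LSOA code, skipping unresolved postcodes.

    Assumes a response shaped like
    {"result": [{"query": "<postcode>", "result": {"codes": {"lsoa": "<code>"}}}]}
    with "result": null for postcodes that were not found.
    """
    mapping: Dict[str, str] = {}
    for entry in bulk_response.get("result") or []:
        result = entry.get("result")
        if result is None:
            continue
        lsoa = (result.get("codes") or {}).get("lsoa")
        if lsoa:
            mapping[entry["query"]] = lsoa
    return mapping


def idaci_rows(
    schools: Iterable[Mapping],
    postcode_to_lsoa: Mapping[str, str],
    idaci_by_lsoa: Mapping[str, Mapping],
) -> Iterator[dict]:
    """Yield one record per school whose postcode resolves to a scored LSOA."""
    for school in schools:
        lsoa = postcode_to_lsoa.get(school["postcode"])
        score = idaci_by_lsoa.get(lsoa) if lsoa else None
        if score is None:
            continue
        yield {
            "urn": school["urn"],
            "lsoa_code": lsoa,
            "idaci_score": score["idaci_score"],
            "idaci_decile": score["idaci_decile"],
        }
```

The yielded keys mirror the columns that `stg_idaci` selects, which is why the mart no longer needs its own postcode join.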

@@ -1,4 +1,5 @@
 -- Staging model: Ofsted Parent View survey responses
+-- The tap computes positive percentages (Strongly agree + Agree) per question.
 with source as (
     select * from {{ source('raw', 'parent_view') }}
@@ -6,17 +7,22 @@ with source as (
 renamed as (
     select
-        cast(urn as integer) as urn,
-        cast(survey_date as date) as survey_date,
-        cast(total_responses as integer) as total_responses,
-        cast(q_happy_pct as numeric) as q_happy_pct,
-        cast(q_safe_pct as numeric) as q_safe_pct,
-        cast(q_progress_pct as numeric) as q_progress_pct,
-        cast(q_well_taught_pct as numeric) as q_well_taught_pct,
-        cast(q_well_led_pct as numeric) as q_well_led_pct,
-        cast(q_behaviour_pct as numeric) as q_behaviour_pct,
-        cast(q_bullying_pct as numeric) as q_bullying_pct,
-        cast(q_recommend_pct as numeric) as q_recommend_pct
+        cast(urn as integer) as urn,
+        cast(survey_date as date) as survey_date,
+        cast(total_responses as integer) as total_responses,
+        cast(q_happy_pct as numeric) as q_happy_pct,
+        cast(q_safe_pct as numeric) as q_safe_pct,
+        cast(q_behaviour_pct as numeric) as q_behaviour_pct,
+        cast(q_bullying_pct as numeric) as q_bullying_pct,
+        cast(q_communication_pct as numeric) as q_communication_pct,
+        cast(q_progress_pct as numeric) as q_progress_pct,
+        cast(q_teaching_pct as numeric) as q_teaching_pct,
+        cast(q_information_pct as numeric) as q_information_pct,
+        cast(q_curriculum_pct as numeric) as q_curriculum_pct,
+        cast(q_future_pct as numeric) as q_future_pct,
+        cast(q_leadership_pct as numeric) as q_leadership_pct,
+        cast(q_wellbeing_pct as numeric) as q_wellbeing_pct,
+        cast(q_recommend_pct as numeric) as q_recommend_pct
     from source
     where urn is not null
 )
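
For reference, the "positive percentage" that the staging comment attributes to the tap (Strongly agree + Agree) might be computed along these lines. This is a minimal sketch, not the tap's code: the function name, the answer labels used as dictionary keys, the one-decimal rounding, and the choice to divide by all responses (including any "Don't know" answers) are assumptions.

```python
from typing import Mapping, Optional

# Answer labels counted as positive; the exact strings are assumed.
POSITIVE_ANSWERS = ("Strongly agree", "Agree")


def positive_pct(counts: Mapping[str, int]) -> Optional[float]:
    """Percentage of responses that are positive, or None when there are no responses."""
    total = sum(counts.values())
    if total == 0:
        return None
    positive = sum(counts.get(answer, 0) for answer in POSITIVE_ANSWERS)
    return round(100.0 * positive / total, 1)
```

Each `q_*_pct` column cast in `stg_parent_view` would then hold one such value per school and question.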