school_compare/pipeline/transform/macros/chain_lineage.sql
Tudor 8f02b5125e
feat(pipeline): add Meltano + dbt + Airflow ELT pipeline scaffold
Replaces the hand-rolled integrator with a production-grade ELT pipeline
using Meltano (Singer taps), dbt Core (medallion architecture), and
Apache Airflow (orchestration). Adds Typesense for search and PostGIS
for geospatial queries.

- 6 custom Singer taps (GIAS, EES, Ofsted, Parent View, FBIT, IDACI)
- dbt project: 12 staging, 5 intermediate, 12 mart models
- 3 Airflow DAGs (daily/monthly/annual schedules)
- Typesense sync + batch geocoding scripts
- docker-compose: add Airflow, Typesense; upgrade to PostGIS
- Portainer stack definition matching live deployment topology

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 08:37:53 +00:00


-- Macro: generate a complete query (three CTEs plus a final select) that unions
-- a source's own rows with rows recorded under predecessor URNs.
-- Output columns: current_urn, source_urn, then every column of the source.
{% macro chain_lineage(source_ref, urn_col='urn', year_col='year') %}
with current_data as (
    -- Every source row keyed by its own URN: current_urn equals source_urn.
    select
        {{ urn_col }} as current_urn,
        {{ urn_col }} as source_urn,
        *
    from {{ source_ref }}
),
predecessor_data as (
    -- Rows recorded under a predecessor URN, re-keyed to the lineage current_urn.
    select
        lin.current_urn,
        src.{{ urn_col }} as source_urn,
        src.*
    from {{ source_ref }} src
    inner join {{ ref('int_school_lineage') }} lin
        on src.{{ urn_col }} = lin.predecessor_urn
    -- Skip predecessor rows for years in which the current URN has its own row.
    where not exists (
        select 1
        from {{ source_ref }} curr
        where curr.{{ urn_col }} = lin.current_urn
          and curr.{{ year_col }} = src.{{ year_col }}
    )
),
combined as (
    select * from current_data
    union all
    select * from predecessor_data
)
select * from combined
{% endmacro %}
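
-- Usage sketch (not part of the original file). The model path and the
-- staging model name `stg_ees_ks2` are hypothetical, chosen only for
-- illustration. Because the macro expands to a full `with ... select`
-- statement, the simplest call site is a model whose entire body is the
-- macro call:

```sql
-- models/intermediate/int_ks2_chained.sql (hypothetical model)
-- The macro renders a complete query, so nothing else is needed here.
-- urn_col and year_col are passed explicitly only to show the parameters;
-- both match the macro's defaults.
{{ chain_lineage(ref('stg_ees_ks2'), urn_col='urn', year_col='year') }}
```

-- If the result needs further filtering or joins, one option is to build on
-- the model above with ref() in a downstream model rather than nesting the
-- macro call, since the rendered SQL already begins with its own `with` clause.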