feat(pipeline): add Meltano + dbt + Airflow ELT pipeline scaffold
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 35s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m9s
Build and Push Docker Images / Build Integrator (push) Successful in 56s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 35s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m9s
Build and Push Docker Images / Build Integrator (push) Successful in 56s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
Replaces the hand-rolled integrator with a production-grade ELT pipeline using Meltano (Singer taps), dbt Core (medallion architecture), and Apache Airflow (orchestration). Adds Typesense for search and PostGIS for geospatial queries. - 6 custom Singer taps (GIAS, EES, Ofsted, Parent View, FBIT, IDACI) - dbt project: 12 staging, 5 intermediate, 12 mart models - 3 Airflow DAGs (daily/monthly/annual schedules) - Typesense sync + batch geocoding scripts - docker-compose: add Airflow, Typesense; upgrade to PostGIS - Portainer stack definition matching live deployment topology Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
16
pipeline/plugins/extractors/tap-uk-fbit/pyproject.toml
Normal file
16
pipeline/plugins/extractors/tap-uk-fbit/pyproject.toml
Normal file
@@ -0,0 +1,16 @@
|
||||
[build-system]
|
||||
requires = ["setuptools>=68", "wheel"]
|
||||
build-backend = "setuptools.build_meta"
|
||||
|
||||
[project]
|
||||
name = "tap-uk-fbit"
|
||||
version = "0.1.0"
|
||||
description = "Singer tap for UK FBIT (Financial Benchmarking and Insights Tool)"
|
||||
requires-python = ">=3.10"
|
||||
dependencies = [
|
||||
"singer-sdk~=0.39",
|
||||
"requests>=2.31",
|
||||
]
|
||||
|
||||
[project.scripts]
|
||||
tap-uk-fbit = "tap_uk_fbit.tap:TapUKFBIT.cli"
|
||||
@@ -0,0 +1 @@
|
||||
"""tap-uk-fbit: Singer tap for Financial Benchmarking and Insights Tool API."""
|
||||
53
pipeline/plugins/extractors/tap-uk-fbit/tap_uk_fbit/tap.py
Normal file
53
pipeline/plugins/extractors/tap-uk-fbit/tap_uk_fbit/tap.py
Normal file
@@ -0,0 +1,53 @@
|
||||
"""FBIT Singer tap — extracts financial data from the FBIT REST API."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from singer_sdk import Stream, Tap
|
||||
from singer_sdk import typing as th
|
||||
|
||||
|
||||
class FBITFinanceStream(Stream):
|
||||
"""Stream: School financial benchmarking data."""
|
||||
|
||||
name = "fbit_finance"
|
||||
primary_keys = ["urn", "year"]
|
||||
replication_key = None
|
||||
|
||||
schema = th.PropertiesList(
|
||||
th.Property("urn", th.IntegerType, required=True),
|
||||
th.Property("year", th.IntegerType, required=True),
|
||||
th.Property("per_pupil_spend", th.NumberType),
|
||||
th.Property("staff_cost_pct", th.NumberType),
|
||||
th.Property("teacher_cost_pct", th.NumberType),
|
||||
th.Property("support_staff_cost_pct", th.NumberType),
|
||||
th.Property("premises_cost_pct", th.NumberType),
|
||||
).to_dict()
|
||||
|
||||
def get_records(self, context):
|
||||
# TODO: Implement FBIT API extraction
|
||||
# The FBIT API requires per-URN requests with rate limiting.
|
||||
# Implementation will batch URNs from dim_school and request
|
||||
# financial data for each.
|
||||
self.logger.warning("FBIT extraction not yet implemented")
|
||||
return iter([])
|
||||
|
||||
|
||||
class TapUKFBIT(Tap):
|
||||
"""Singer tap for UK FBIT financial data."""
|
||||
|
||||
name = "tap-uk-fbit"
|
||||
|
||||
config_jsonschema = th.PropertiesList(
|
||||
th.Property(
|
||||
"base_url",
|
||||
th.StringType,
|
||||
default="https://financial-benchmarking-and-insights-tool.education.gov.uk/api",
|
||||
),
|
||||
).to_dict()
|
||||
|
||||
def discover_streams(self):
|
||||
return [FBITFinanceStream(self)]
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
TapUKFBIT.cli()
|
||||
Reference in New Issue
Block a user