fix(pipeline): use to_date for DD-MM-YYYY GIAS dates, exclude EES models from daily DAG
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m4s
Build and Push Docker Images / Build Integrator (push) Successful in 56s
Build and Push Docker Images / Build Kestra Init (push) Successful in 31s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 1m30s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
GIAS CSV dates are in DD-MM-YYYY format, so parse them with to_date() instead of cast().

Exclude int_ks2_with_lineage+ and int_ks4_with_lineage+ from the daily DAG selector, since they depend on EES data that is not yet loaded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -79,7 +79,7 @@ print(f'Validation passed: {{count}} GIAS rows')
 
 dbt_build = BashOperator(
     task_id="dbt_build",
-    bash_command=f"cd {PIPELINE_DIR}/transform && {DBT_BIN} build --profiles-dir . --target production --select stg_gias_establishments+ stg_gias_links+",
+    bash_command=f"cd {PIPELINE_DIR}/transform && {DBT_BIN} build --profiles-dir . --target production --select stg_gias_establishments+ stg_gias_links+ --exclude int_ks2_with_lineage+ int_ks4_with_lineage+",
 )
 
 geocode_new = BashOperator(
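The selector change above can be sketched as a small helper that assembles the same `dbt build` command from lists of selectors. This is only an illustration, not code from the repository: `build_dbt_command` is a hypothetical name, and the `/opt/pipeline` and `dbt` values in the usage lines are placeholders standing in for `PIPELINE_DIR` and `DBT_BIN`.

```python
import shlex
from typing import Sequence


def build_dbt_command(
    pipeline_dir: str,
    dbt_bin: str,
    select: Sequence[str],
    exclude: Sequence[str] = (),
) -> str:
    """Assemble a `dbt build` shell command (hypothetical helper).

    A trailing `+` on a selector means "this model and everything
    downstream of it"; `--exclude` removes nodes from the selected set.
    """
    parts = [
        dbt_bin, "build",
        "--profiles-dir", ".",
        "--target", "production",
        "--select", *select,
    ]
    if exclude:
        parts += ["--exclude", *exclude]
    # Quote each argument so unusual characters cannot break the shell line.
    quoted = " ".join(shlex.quote(part) for part in parts)
    return f"cd {shlex.quote(pipeline_dir + '/transform')} && {quoted}"


# Placeholder values; the real DAG uses PIPELINE_DIR and DBT_BIN.
command = build_dbt_command(
    "/opt/pipeline",
    "dbt",
    select=["stg_gias_establishments+", "stg_gias_links+"],
    exclude=["int_ks2_with_lineage+", "int_ks4_with_lineage+"],
)
print(command)
```

Keeping the selectors in lists makes it easier to drop the `--exclude` entries again once the EES data is loaded.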
@@ -30,8 +30,8 @@ renamed as (
         "County (name)" as county,
         "Postcode" as postcode,
         "EstablishmentStatus (name)" as status,
-        case when "OpenDate" = '' then null else cast("OpenDate" as date) end as open_date,
-        case when "CloseDate" = '' then null else cast("CloseDate" as date) end as close_date,
+        case when "OpenDate" = '' then null else to_date("OpenDate", 'DD-MM-YYYY') end as open_date,
+        case when "CloseDate" = '' then null else to_date("CloseDate", 'DD-MM-YYYY') end as close_date,
         "Trusts (name)" as academy_trust_name,
         cast(nullif("Trusts (code)", '') as integer) as academy_trust_uid,
         "UrbanRural (name)" as urban_rural,
@@ -9,7 +9,7 @@ renamed as (
         cast("URN" as integer) as urn,
         cast("LinkURN" as integer) as linked_urn,
         "LinkType" as link_type,
-        case when "LinkEstablishedDate" = '' then null else cast("LinkEstablishedDate" as date) end as link_date
+        case when "LinkEstablishedDate" = '' then null else to_date("LinkEstablishedDate", 'DD-MM-YYYY') end as link_date
     from source
     where "URN" is not null
       and "LinkURN" is not null
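The date handling in the two staging models (empty string becomes null, anything else is parsed as DD-MM-YYYY) can be mirrored in Python to show why an explicit format matters. This is a minimal sketch for illustration only: `parse_gias_date` is a hypothetical name, and the sample values are made up rather than taken from the GIAS extract.

```python
from datetime import date, datetime
from typing import Optional


def parse_gias_date(raw: str) -> Optional[date]:
    """Mirror the SQL `case when ... = '' then null else to_date(...)`.

    Hypothetical helper: empty strings map to None, everything else is
    parsed strictly as day-month-year.
    """
    if raw == "":
        return None
    return datetime.strptime(raw, "%d-%m-%Y").date()


# With an explicit format, 01-09-2011 is 1 September, never 9 January.
opened = parse_gias_date("01-09-2011")
missing = parse_gias_date("")

# A parser that expects ISO (YYYY-MM-DD) input rejects the same string,
# which is analogous to a plain cast failing on day-first dates.
try:
    date.fromisoformat("01-09-2011")
    iso_parse_failed = False
except ValueError:
    iso_parse_failed = True

print(opened, missing, iso_parse_failed)
```

An explicit format string also fails loudly on unexpected input instead of silently swapping day and month.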