fix(tap-gias): declare numeric CSV columns as StringType

The CSV is read with dtype=str, so every value arrives as a string.
Declaring LA (code) and EstablishmentNumber as IntegerType caused schema
validation failures in target-postgres. Use StringType for all columns
except URN, which is explicitly cast to int for the primary key.
Type casting for the other columns happens in the dbt staging models.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 14:03:26 +00:00
parent 84261f6125
commit 0062a5eabe


@@ -26,14 +26,16 @@ class GIASEstablishmentsStream(Stream):
     replication_key = None
     # Schema is wide (~250 columns); we declare key columns and pass through the rest
+    # All columns are read as strings from CSV; dbt staging models handle type casting.
+    # Only URN is cast to int in get_records() for the primary key.
     schema = th.PropertiesList(
         th.Property("URN", th.IntegerType, required=True),
         th.Property("EstablishmentName", th.StringType),
         th.Property("TypeOfEstablishment (name)", th.StringType),
         th.Property("PhaseOfEducation (name)", th.StringType),
-        th.Property("LA (code)", th.IntegerType),
+        th.Property("LA (code)", th.StringType),
         th.Property("LA (name)", th.StringType),
-        th.Property("EstablishmentNumber", th.IntegerType),
+        th.Property("EstablishmentNumber", th.StringType),
         th.Property("EstablishmentStatus (name)", th.StringType),
         th.Property("Postcode", th.StringType),
     ).to_dict()
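
The mismatch this change addresses can be reproduced with a minimal standalone sketch. It is not the tap's actual code: the stdlib csv module stands in for the dtype=str read, the sample rows are hypothetical, and invalid_fields is a simplified stand-in for JSON Schema type checking, not the validator target-postgres uses.

```python
import csv
import io

# Hypothetical two-row sample; the real extract has ~250 columns.
SAMPLE_CSV = (
    "URN,EstablishmentName,LA (code),EstablishmentNumber\n"
    "100000,Example Primary School,201,3614\n"
    "100001,Another Example School,202,6005\n"
)

# Simplified stand-in for JSON Schema type checks (assumption, not a real validator).
TYPE_CHECKS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def invalid_fields(record, schema):
    """Return sorted names of non-null fields whose value does not match the declared type."""
    return sorted(
        name
        for name, type_name in schema.items()
        if record.get(name) is not None and not TYPE_CHECKS[type_name](record[name])
    )


def read_records(text):
    """Yield rows with every value as a string, except URN, which is cast to int."""
    for row in csv.DictReader(io.StringIO(text)):
        row["URN"] = int(row["URN"])
        yield row


# Declared types before and after the change, reduced to four columns.
OLD_SCHEMA = {
    "URN": "integer",
    "EstablishmentName": "string",
    "LA (code)": "integer",
    "EstablishmentNumber": "integer",
}
NEW_SCHEMA = {**OLD_SCHEMA, "LA (code)": "string", "EstablishmentNumber": "string"}

records = list(read_records(SAMPLE_CSV))
old_errors = [invalid_fields(r, OLD_SCHEMA) for r in records]
new_errors = [invalid_fields(r, NEW_SCHEMA) for r in records]

print(old_errors)  # both rows flag EstablishmentNumber and LA (code)
print(new_errors)  # both rows pass
```

Under these assumptions, the old declarations reject every row because "201" is a string rather than an integer, while the new declarations accept the rows unchanged and leave numeric casting to a later step.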