fix(ees-tap): filter out rows with null URN before emitting
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m10s
Build and Push Docker Images / Build Integrator (push) Successful in 56s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 1m47s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
The admissions school-level file contains some rows with a null school_urn (LA/category aggregates that survive the geographic_level filter). These cause a not-null constraint violation at target-postgres. Drop any row where the URN column is null or empty before yielding records.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -95,6 +95,11 @@ class EESDatasetStream(Stream):
         if "geographic_level" in df.columns:
             df = df[df["geographic_level"] == "School"]
 
+        # Drop rows with no URN (LA/category aggregates that slip through the level filter)
+        urn_col = self._urn_column
+        if urn_col in df.columns:
+            df = df[df[urn_col].notna() & (df[urn_col] != "")]
+
         self.logger.info("Emitting %d school-level rows", len(df))
 
         for _, row in df.iterrows():
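The URN filter added in this hunk can be exercised in isolation. The following is a minimal sketch, not the tap's actual code: the helper name `drop_rows_without_urn` and the sample rows are hypothetical and made up for illustration, and only the boolean mask mirrors the line added in the commit.

```python
import pandas as pd


def drop_rows_without_urn(df: pd.DataFrame, urn_col: str) -> pd.DataFrame:
    """Keep only rows whose URN column is neither null nor an empty string.

    Hypothetical helper: the commit applies the same mask inline.
    """
    if urn_col not in df.columns:
        # Nothing to filter on, so return the frame unchanged.
        return df
    return df[df[urn_col].notna() & (df[urn_col] != "")]


# Made-up sample: every row passes the geographic_level filter,
# but two of them carry no usable URN.
sample = pd.DataFrame(
    {
        "geographic_level": ["School", "School", "School", "School"],
        "school_urn": ["100001", None, "", "100002"],
    }
)

filtered = drop_rows_without_urn(sample, "school_urn")
print(filtered["school_urn"].tolist())
```

Note that this mask only catches nulls and exact empty strings. A whitespace-only value such as `" "` would still pass, so if the source file can contain those, stripping the column first may be worth considering.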