Commit Graph

19 Commits

Author SHA1 Message Date
03cd1de6af fix(airflow): delete and reimport DAGs on init to clear stale task refs
Some checks failed
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 34s
Build and Push Docker Images / Build Integrator (push) Has been cancelled
Build and Push Docker Images / Build Kestra Init (push) Has been cancelled
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Has been cancelled
Build and Push Docker Images / Trigger Portainer Update (push) Has been cancelled
Build and Push Docker Images / Build Frontend (Next.js) (push) Has been cancelled
When tasks are removed from a DAG, old serialized metadata in the DB
causes 'Task not found' errors. Delete all DAGs before reserializing
on each deploy to ensure a clean state.
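
The delete-then-reserialize sequence described above could be planned along these lines. This is only a sketch: the DAG ids are hypothetical, and it assumes the `airflow dags delete --yes <dag_id>` and `airflow dags reserialize` CLI commands rather than whatever script the init container actually runs.

```python
import subprocess


def plan_commands(dag_ids: list[str]) -> list[list[str]]:
    """Build the CLI calls: delete every DAG first, then reserialize once."""
    commands = [["airflow", "dags", "delete", "--yes", dag_id] for dag_id in dag_ids]
    commands.append(["airflow", "dags", "reserialize"])
    return commands


def run_plan(commands: list[list[str]]) -> None:
    """Execute the planned commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Deleting before reserializing means no serialized metadata from removed tasks survives the deploy; the trade-off is that DAG run history stored under those DAG ids is cleared as well.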

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 21:28:03 +00:00
72ef1b03b7 fix(airflow): use correct Airflow 3 env vars for multi-container JWT and Execution API
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 33s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m6s
Build and Push Docker Images / Build Integrator (push) Successful in 54s
Build and Push Docker Images / Build Kestra Init (push) Successful in 30s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 30s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 0s
Replace Airflow 2.x env vars (CORE__SECRET_KEY, CORE__INTERNAL_API_URL) with
correct Airflow 3.x equivalents (API_AUTH__JWT_SECRET, API_AUTH__JWT_ISSUER,
CORE__EXECUTION_API_SERVER_URL) on all three Airflow services.
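
As a rough illustration, the three settings might be applied identically to each service like this. The service names, the issuer value, and the example URL are assumptions (the commit names neither the services nor the values), and the `AIRFLOW__` prefix follows Airflow's usual environment variable convention.

```python
JWT_SECRET_VAR = "AIRFLOW__API_AUTH__JWT_SECRET"
JWT_ISSUER_VAR = "AIRFLOW__API_AUTH__JWT_ISSUER"
EXECUTION_API_VAR = "AIRFLOW__CORE__EXECUTION_API_SERVER_URL"

# Hypothetical service names; the commit only says "all three Airflow services".
SERVICES = ("airflow-init", "airflow-scheduler", "airflow-api-server")


def airflow_env(jwt_secret: str, jwt_issuer: str, api_server_url: str) -> dict[str, str]:
    """Return the Airflow 3 variables that every service should receive."""
    return {
        JWT_SECRET_VAR: jwt_secret,
        JWT_ISSUER_VAR: jwt_issuer,
        EXECUTION_API_VAR: api_server_url,
    }


def shares_jwt_secret(envs: dict[str, dict[str, str]]) -> bool:
    """True only if every service carries exactly the same JWT secret."""
    return len({env.get(JWT_SECRET_VAR) for env in envs.values()}) == 1
```

A check like `shares_jwt_secret` could run in CI to catch a service that was left with a different or missing secret.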

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 20:11:06 +00:00
ea160b53df fix(airflow): point scheduler to api-server via INTERNAL_API_URL
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 34s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m3s
Build and Push Docker Images / Build Integrator (push) Successful in 55s
Build and Push Docker Images / Build Kestra Init (push) Successful in 30s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 33s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
With separate containers, task workers in the scheduler need the
api-server's address for the Execution API. The address defaults to
localhost:8080, which fails across containers. Set INTERNAL_API_URL to
the api-server's Docker service name.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 17:09:17 +00:00
8a2503230f fix(airflow): split back to separate scheduler and api-server containers
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m1s
Build and Push Docker Images / Build Integrator (push) Successful in 55s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 29s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 0s
Running both in one container caused JWT secret key race conditions.
Using separate containers with the same AIRFLOW__CORE__SECRET_KEY env
var ensures both processes use identical JWT signing keys. A shared
airflow_logs volume allows the api-server to read task logs written
by the scheduler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 17:00:07 +00:00
677e80ad70 fix(airflow): generate config before starting processes, set fixed secret key
Some checks failed
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 31s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m3s
Build and Push Docker Images / Build Integrator (push) Successful in 54s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Has been cancelled
Build and Push Docker Images / Trigger Portainer Update (push) Has been cancelled
Build and Push Docker Images / Build Kestra Init (push) Has been cancelled
The init container and airflow container have separate filesystems, so
airflow.cfg generated by db migrate is not available to the scheduler/
api-server. Without a config file, both processes race to generate
their own with different random JWT secret keys.

Fix by:
1. Running `airflow config list` first to generate airflow.cfg once
2. Setting a fixed SECRET_KEY via env var (>= 64 bytes for SHA512)
3. Adding sleep 3 so scheduler writes config before api-server starts
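
For step 2, a fixed key of sufficient length could be generated once and stored as a deployment secret. The sketch below uses Python's standard `secrets` module; the helper names are illustrative, and the 64-byte minimum is taken from the commit message.

```python
import secrets

MIN_KEY_BYTES = 64  # minimum from the commit; a SHA-512 digest is also 64 bytes


def generate_secret(n_bytes: int = MIN_KEY_BYTES) -> str:
    """Return a random hex string built from n_bytes of entropy."""
    return secrets.token_hex(n_bytes)


def is_long_enough(secret: str) -> bool:
    """Check that the encoded secret meets the minimum byte length."""
    return len(secret.encode("utf-8")) >= MIN_KEY_BYTES
```

The value should be generated once and injected through the environment, not regenerated at container start, since regenerating it would bring back the mismatch between processes.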

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 16:57:22 +00:00
1dbcc24434 fix(airflow): stop deleting airflow.cfg, let processes share config
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 31s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m2s
Build and Push Docker Images / Build Integrator (push) Successful in 54s
Build and Push Docker Images / Build Kestra Init (push) Successful in 30s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 30s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
Deleting airflow.cfg at container start caused the scheduler and
api-server to each generate their own random JWT secret key, leading
to 'Signature verification failed' when task workers communicated
with the api-server. Let both processes share the config file
generated by db migrate (env vars still override where needed).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 16:49:18 +00:00
b3e4769d82 fix(airflow): set shared internal API secret key
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 30s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m2s
Build and Push Docker Images / Build Integrator (push) Successful in 55s
Build and Push Docker Images / Build Kestra Init (push) Successful in 30s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 30s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
When scheduler and api-server run in the same container, both generate
independent JWT signing keys on startup. The scheduler's task workers
then fail with 'Invalid auth token: Signature verification failed'
when communicating with the api-server. Fix by setting a shared
INTERNAL_API_SECRET_KEY via env var.
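
The failure mode can be reproduced in miniature with plain HMAC signatures. This sketch does not use Airflow's JWT code; it assumes an HMAC-SHA512 style signature purely to show why two independently generated keys cannot verify each other's tokens.

```python
import hashlib
import hmac
import secrets


def sign(message: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA512 signature over the message."""
    return hmac.new(key, message, hashlib.sha512).digest()


def verify(message: bytes, signature: bytes, key: bytes) -> bool:
    """Recompute the signature with the verifier's key and compare."""
    return hmac.compare_digest(sign(message, key), signature)


# Two processes that each generate their own key at startup...
scheduler_key = secrets.token_bytes(64)
api_server_key = secrets.token_bytes(64)
# ...versus one key shared by both through the environment.
shared_key = secrets.token_bytes(64)

token = b"task-instance-token"  # placeholder payload
```

With these values, `verify(token, sign(token, scheduler_key), api_server_key)` is false, while signing and verifying with `shared_key` succeeds.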

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 16:42:02 +00:00
07869738c0 fix(airflow): merge scheduler and api-server into single container
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 33s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m6s
Build and Push Docker Images / Build Integrator (push) Successful in 57s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 31s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 0s
With LocalExecutor, tasks run in the scheduler process and logs are
written locally. Running api-server and scheduler in separate containers
meant the api-server couldn't read task logs (empty hostname in log
fetch URL). Combining them into one container eliminates the issue —
logs are always on the local filesystem.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 12:16:18 +00:00
a3a50cc8d2 fix(airflow): remove generated airflow.cfg so env vars take effect
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 33s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m7s
Build and Push Docker Images / Build Integrator (push) Successful in 57s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 31s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
airflow db migrate generates airflow.cfg with default values that
shadow our env vars (DAGS_FOLDER, WORKER_LOG_SERVER_HOST, etc).
Delete the generated config file before starting each service so
Airflow falls through to env var configuration exclusively.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 12:12:32 +00:00
2ba5e57286 fix(airflow): set scheduler hostname for log server resolution
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m11s
Build and Push Docker Images / Build Integrator (push) Successful in 57s
Build and Push Docker Images / Build Kestra Init (push) Successful in 31s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 31s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
The scheduler's log server binds to [::]:8793 but doesn't advertise a
hostname, so the api-server gets 'http://:8793/...' (no host) when
fetching task logs. Fix by setting the scheduler's hostname and
configuring WORKER_LOG_SERVER_HOST so the api-server can reach it.
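
The hostless URL is easy to reproduce: formatting a log URL with an empty host yields something a URL parser reports as having no hostname. The helper names and the `/log` path below are illustrative; only port 8793 comes from the commit.

```python
from urllib.parse import urlsplit


def log_server_url(host: str, port: int = 8793, path: str = "/log") -> str:
    """Format the URL the api-server would use to fetch a task log."""
    return f"http://{host}:{port}{path}"


def has_host(url: str) -> bool:
    """True if the URL contains a non-empty hostname."""
    return urlsplit(url).hostname is not None
```

With an empty host the port is still parsed correctly but the hostname is missing, which is why the fetch cannot be routed until a hostname is configured.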

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 12:06:22 +00:00
6b4eb08a5e fix(airflow): share logs volume between scheduler and api-server
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m10s
Build and Push Docker Images / Build Integrator (push) Successful in 56s
Build and Push Docker Images / Build Kestra Init (push) Successful in 31s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 31s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
The api-server couldn't fetch task logs because LocalExecutor runs tasks
in the scheduler process, writing logs to its local filesystem. The
api-server tried to fetch via HTTP but the scheduler's log server had
no hostname set. Fix by sharing a named volume for logs between both
containers so the api-server reads logs directly from the filesystem.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:55:43 +00:00
b6a487776b fix(airflow): set DAGS_FOLDER in image env and reserialize on init
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m5s
Build and Push Docker Images / Build Integrator (push) Successful in 57s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 32s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 0s
- Add AIRFLOW__CORE__DAGS_FOLDER env var in Dockerfile so it's always set
- Run `airflow dags reserialize` after `db migrate` in init container so
  DAGs appear immediately without waiting for scheduler scan interval

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 11:05:41 +00:00
904093ea8a fix(airflow): remove DAG volume mounts, use image-baked DAGs
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 33s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m10s
Build and Push Docker Images / Build Integrator (push) Successful in 57s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 32s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
The named volume was shadowing the DAGs built into the pipeline image
with an empty directory. DAGs are now served directly from the image
and update on each CI build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:27:39 +00:00
c4e3b6a7e4 fix(typesense): use TCP check for healthcheck, no curl/wget available
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 33s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m5s
Build and Push Docker Images / Build Integrator (push) Successful in 57s
Build and Push Docker Images / Build Kestra Init (push) Successful in 31s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 31s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
The Typesense image has neither curl nor wget. Use bash /dev/tcp for a
simple port connectivity check instead.
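
The commit uses bash's /dev/tcp for the check; the same idea, expressed in Python for illustration only, is to attempt a TCP connection and report whether it succeeded. The function name and timeout are assumptions.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, and timed-out connections all count as closed.
        return False
```

A check like this confirms only that something is listening on the port, not that the service behind it is ready to answer requests.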

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:14:59 +00:00
09d704c325 fix(typesense): use wget instead of curl for healthcheck
Some checks failed
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 33s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m12s
Build and Push Docker Images / Build Kestra Init (push) Has been cancelled
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Has been cancelled
Build and Push Docker Images / Trigger Portainer Update (push) Has been cancelled
Build and Push Docker Images / Build Integrator (push) Has been cancelled
Typesense Docker image ships with wget but not curl.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:12:54 +00:00
1574089b95 fix(pipeline): update Airflow healthcheck to /api/v2/monitor/health
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m9s
Build and Push Docker Images / Build Integrator (push) Successful in 56s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 31s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
Airflow 3 moved the health endpoint from /health to /api/v2/monitor/health.
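
A probe against the new path might look like the following sketch. The base URL is whatever the api-server is reachable at in a given deployment, and the helper names are hypothetical.

```python
import urllib.request

HEALTH_PATH = "/api/v2/monitor/health"


def health_url(base_url: str) -> str:
    """Join the api-server base URL and the Airflow 3 health path."""
    return f"{base_url.rstrip('/')}{HEALTH_PATH}"


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection failures and HTTP error responses (URLError subclasses).
        return False
```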

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 10:01:09 +00:00
a7904b627d fix(pipeline): migrate to Airflow 3 API server and SimpleAuthManager
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 34s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m12s
Build and Push Docker Images / Build Integrator (push) Successful in 58s
Build and Push Docker Images / Build Kestra Init (push) Successful in 31s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 31s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
Airflow 3 replaced `airflow webserver` with `airflow api-server` and
removed the `airflow users` CLI. Auth is now via SimpleAuthManager
configured through AIRFLOW__CORE__SIMPLE_AUTH_MANAGER_USERS env var.
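
The variable's value is, as far as the Airflow configuration reference describes it, a comma-separated list of `username:role` pairs. A small parser makes the shape concrete; the function name and the sample users are illustrative and this is not Airflow's own parsing code.

```python
def parse_simple_auth_users(value: str) -> dict[str, str]:
    """Parse 'user:role,user:role' into a {username: role} mapping."""
    users: dict[str, str] = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and empty segments
        username, sep, role = entry.partition(":")
        if not sep or not username or not role:
            raise ValueError(f"expected 'username:role', got {entry!r}")
        users[username] = role
    return users
```

For example, `parse_simple_auth_users("admin:admin,viewer:viewer")` maps each username to its role.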

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 09:32:08 +00:00
deb4024731 chore(pipeline): bump all dependencies to latest stable versions
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 32s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m4s
Build and Push Docker Images / Build Integrator (push) Successful in 57s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Build Pipeline (Meltano + dbt + Airflow) (push) Successful in 1m45s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 0s
- Airflow 2.11 → 3.1 (BashOperator moved to providers-standard)
- Meltano 3.5 → 4.1 (meltano.yml version 2, meltanolabs target-postgres)
- dbt-postgres 1.9 → 1.10
- singer-sdk 0.39 → 0.53 (all 6 taps)
- Typesense Docker 27.1 → 30.1
- Typesense Python client >=2.0
- Python base image 3.12 → 3.13

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 09:18:11 +00:00
8f02b5125e feat(pipeline): add Meltano + dbt + Airflow ELT pipeline scaffold
All checks were successful
Build and Push Docker Images / Build Backend (FastAPI) (push) Successful in 35s
Build and Push Docker Images / Build Frontend (Next.js) (push) Successful in 1m9s
Build and Push Docker Images / Build Integrator (push) Successful in 56s
Build and Push Docker Images / Build Kestra Init (push) Successful in 32s
Build and Push Docker Images / Trigger Portainer Update (push) Successful in 1s
Replaces the hand-rolled integrator with a production-grade ELT pipeline
using Meltano (Singer taps), dbt Core (medallion architecture), and
Apache Airflow (orchestration). Adds Typesense for search and PostGIS
for geospatial queries.

- 6 custom Singer taps (GIAS, EES, Ofsted, Parent View, FBIT, IDACI)
- dbt project: 12 staging, 5 intermediate, 12 mart models
- 3 Airflow DAGs (daily/monthly/annual schedules)
- Typesense sync + batch geocoding scripts
- docker-compose: add Airflow, Typesense; upgrade to PostGIS
- Portainer stack definition matching live deployment topology

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-26 08:37:53 +00:00