ENGIE Impact
At ENGIE Impact, my work sits at the intersection of business process, product delivery, and platform engineering. The systems I build support billing, carbon accounting, consumption analytics, reporting, and internal data products, so the design work is not only about throughput: it is about understanding how the data is used, where definitions can drift, and how to keep changes safe as product requirements evolve.
- Served as the accountable engineering owner for core data delivery, converting work previously split across an outsourced 7-person team into a focused internal ownership model with clearer requirements, priorities, and production accountability.
- Built and operated an event-driven AWS ingestion platform processing 500M+ records/day into PostgreSQL and InfluxDB, supporting billing, reporting, carbon analytics, and client-facing dashboards used across enterprise accounts (ingestion pattern sketched after this list).
- Partnered with product and engineering stakeholders to translate feature needs into data contracts, source-to-target logic, validation rules, and release plans, reducing ambiguity before code was written.
- Replaced legacy "deploy-everything" Snowflake releases with a manifest-driven, change-only CI/CD process, reducing release SQL from ~95K lines to 1.5K–3K lines across 247 files and making data-platform changes easier to review and ship (release diffing sketched after this list).
- Proposed and designed direct Snowflake sourcing, helping retire Matillion for $40K–$50K in annual savings while moving teams from rigid once-daily extracts toward more flexible, on-demand access patterns.
- Designed a CDC-style ingestion pattern with hash-based change detection, reducing reprocessing time from 10.5 hours to 3 hours and making backfills, corrections, and late-arriving updates safer to run (hash logic sketched after this list).
- Productized FX and geography enrichment through batch REST and GraphQL APIs, turning repeated lookup logic into a shared internal service that product teams could reuse instead of rebuilding one-off Snowflake queries (batch endpoint sketched after this list).
- Modeled Snowflake external stages on Azure Blob with a RAW contract view layer, then used dbt to build Stage, Dim, and Fact models that gave analytics and downstream services a consistent source of truth.
- Implemented Kafka and Spark ingestion with schema validation, deduplication, and idempotent writes, improving end-to-end performance by 30–50% while keeping failure recovery predictable (job skeleton sketched after this list).
- Unblocked long-running initiatives, including an RDS migration and Unified UOM workstream, by tracing issues across data flows, source behavior, model assumptions, and ownership gaps until the path to release was clear.
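
The sketches below illustrate several of the patterns referenced above; each is a simplified, hypothetical stand-in rather than production code. First, the event-driven ingestion path: a minimal S3-to-PostgreSQL handler, assuming an SQS-triggered Lambda, JSON-lines payloads, and a hypothetical `readings` table with a unique key on (meter_id, read_at). All names and shapes are illustrative.

```python
"""Minimal sketch of the event-driven ingestion pattern (assumptions:
SQS-triggered Lambda, JSON-lines files, hypothetical 'readings' table)."""
import json
import os

import boto3
import psycopg2

s3 = boto3.client("s3")

def handler(event, context):
    # Each SQS record wraps an S3 "ObjectCreated" event notification.
    rows = []
    for sqs_record in event["Records"]:
        s3_event = json.loads(sqs_record["body"])
        for rec in s3_event.get("Records", []):  # test events have no Records
            bucket = rec["s3"]["bucket"]["name"]
            key = rec["s3"]["object"]["key"]
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            # One JSON object per line; the sketch assumes each carries
            # meter_id / read_at / value fields.
            for line in body.decode("utf-8").splitlines():
                if line.strip():
                    rows.append(json.loads(line))

    # Idempotent load: the unique key makes replays of the same file safe.
    with psycopg2.connect(os.environ["PG_DSN"]) as conn, conn.cursor() as cur:
        cur.executemany(
            """
            INSERT INTO readings (meter_id, read_at, value)
            VALUES (%(meter_id)s, %(read_at)s, %(value)s)
            ON CONFLICT (meter_id, read_at) DO UPDATE SET value = EXCLUDED.value
            """,
            rows,
        )
    return {"loaded": len(rows)}
```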
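Next, the manifest-driven, change-only release step. The idea is to hash every SQL script, compare against the manifest committed with the previous release, and ship only the difference; the paths and the deploy hook here are assumptions for illustration.

```python
"""Sketch of a manifest-driven, change-only release step (paths and the
deploy step are hypothetical; the real pipeline runs scripts in Snowflake)."""
import hashlib
import json
import pathlib

MANIFEST = pathlib.Path("release/manifest.json")  # hypothetical path
SQL_ROOT = pathlib.Path("snowflake")              # hypothetical path

def file_hash(path: pathlib.Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_scripts() -> list[pathlib.Path]:
    """Compare the current tree against the last released manifest."""
    previous = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    return [
        path
        for path in sorted(SQL_ROOT.rglob("*.sql"))
        if previous.get(str(path)) != file_hash(path)
    ]

def write_manifest() -> None:
    """Record the hashes of everything just released."""
    manifest = {str(p): file_hash(p) for p in sorted(SQL_ROOT.rglob("*.sql"))}
    MANIFEST.write_text(json.dumps(manifest, indent=2))

if __name__ == "__main__":
    for script in changed_scripts():
        print(f"deploying {script}")  # real pipeline executes this in Snowflake
    write_manifest()
```

Only the changed scripts enter the release, which is what shrinks a "deploy-everything" run down to the handful of files a change actually touches.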
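The CDC-style pattern rests on a stable row hash over business columns, so only new or changed rows are reprocessed and reruns are no-ops. Column names and the key format below are illustrative.

```python
"""Sketch of hash-based change detection for the CDC-style pattern
(column names and key format are illustrative)."""
import hashlib
import json

TRACKED_COLUMNS = ["account_id", "period", "usage_kwh", "cost_usd"]  # illustrative

def row_hash(row: dict) -> str:
    """Stable digest over the tracked columns only, so bookkeeping fields
    (load timestamps etc.) never trigger false changes."""
    payload = json.dumps(
        {c: row.get(c) for c in TRACKED_COLUMNS}, sort_keys=True, default=str
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def detect_changes(incoming: list[dict], seen: dict[str, str]) -> list[dict]:
    """Return only new or changed rows; updating `seen` makes reruns of
    the same batch no-ops, which keeps backfills and corrections safe."""
    changed = []
    for row in incoming:
        key = f"{row['account_id']}|{row['period']}"
        digest = row_hash(row)
        if seen.get(key) != digest:
            changed.append(row)
            seen[key] = digest
    return changed
```

Hashing only the tracked columns is the detail that makes late-arriving corrections cheap: a row re-enters the pipeline only when a value someone cares about actually moved.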
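For the enrichment service, the REST side can be as small as a single batch endpoint. The sketch below uses FastAPI with an in-memory rate table standing in for the real lookup source; every route, field name, and rate is hypothetical.

```python
"""Minimal sketch of a batch FX-enrichment endpoint (routes, request
shape, and the in-memory rate table are all illustrative)."""
from datetime import date

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Stand-in for the real rate source (e.g., a Snowflake-backed lookup).
RATES = {("EUR", "USD"): 1.08, ("GBP", "USD"): 1.27}

class FxQuery(BaseModel):
    from_ccy: str
    to_ccy: str
    as_of: date

class FxResult(FxQuery):
    rate: float | None = None

@app.post("/fx/batch", response_model=list[FxResult])
def fx_batch(queries: list[FxQuery]) -> list[FxResult]:
    """One round trip resolves many lookups, replacing per-row queries."""
    return [
        FxResult(**q.model_dump(), rate=RATES.get((q.from_ccy, q.to_ccy)))
        for q in queries
    ]
```

Batching is the point: product teams send one request per job instead of re-implementing the lookup as one-off Snowflake queries.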
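Finally, the Kafka-to-Spark path: schema validation via an explicit StructType, watermarked deduplication on an event id, and a foreachBatch sink where the real writer merges on a unique key so replayed micro-batches stay idempotent. Broker, topic, and connection details are illustrative.

```python
"""Sketch of Kafka -> Spark ingestion with schema validation, dedup, and
an idempotent sink (broker, topic, schema, and JDBC details illustrative)."""
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import (DoubleType, StringType, StructField,
                               StructType, TimestampType)

spark = SparkSession.builder.appName("ingest-sketch").getOrCreate()

# Schema validation: records that do not parse become NULL and are dropped.
schema = StructType([
    StructField("event_id", StringType(), False),
    StructField("meter_id", StringType(), False),
    StructField("event_time", TimestampType(), False),
    StructField("value", DoubleType(), True),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # illustrative
    .option("subscribe", "meter-readings")             # illustrative topic
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .where(col("event_id").isNotNull())                # drop malformed records
    .withWatermark("event_time", "1 hour")
    .dropDuplicates(["event_id"])                      # at-least-once -> effectively-once
)

def upsert(batch_df, batch_id):
    """Idempotent sink: the real writer merges on event_id, so a replayed
    micro-batch overwrites rather than duplicates. Plain append shown here."""
    (batch_df.write.format("jdbc")
        .option("url", "jdbc:postgresql://db/ingest")  # illustrative
        .option("dbtable", "staging.meter_readings")
        .option("driver", "org.postgresql.Driver")
        .mode("append")
        .save())

query = (events.writeStream
         .foreachBatch(upsert)
         .option("checkpointLocation", "/tmp/ingest-sketch-ckpt")
         .start())
query.awaitTermination()
```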