status: available · experience: 6+ years · scale: 500M+ records/day · base: Boston, MA
Abhijit Kunjiraman  ·  Senior Data Engineer

I turn business context into data products that ship.

I build data platforms by first understanding the business process, the source-system behavior, and the product or reporting feature that depends on the data. Then I design pipelines, models, APIs, and release workflows that make those features faster to build, safer to change, and easier to trust in production.

Current role
ENGIE Impact · Boston
Business domains
Billing · carbon · reporting
Core platform
Snowflake · AWS · Azure · dbt
Best fit
Senior Data Engineer roles
01 / impact

Business outcomes created by data platforms, not just pipelines.

Daily data scale
500M+
records powering analytics
Release effort reduced
97%
smaller, reviewable deploys
Backfill time reduced
−71%
faster recovery and replay
Tooling cost avoided
$50K
direct Snowflake sourcing
↳ operating model

I do not start with the pipeline. I start with the feature, report, workflow, or business decision the data is supposed to support. That changes the engineering conversation from "move these rows" to "make this capability reliable enough for people to build on."

/ business

Understand the workflow first

I clarify what the business is trying to accomplish, who consumes the data, which decisions or product features depend on it, and what correctness means in that context.

Examples · billing rules · carbon calculations · client reporting · operational analytics
/ data

Model how the data behaves

I look for the hard parts early: grain, keys, deletes, late-arriving updates, source-system ownership, historical corrections, and the difference between what a field is called and what it actually means.

Patterns · CDC · slowly changing data · auditability · reconciliation · contracts
/ platform

Build reusable foundations

I turn repeated asks into shared data products: governed warehouse layers, enrichment APIs, reusable validation, deployment automation, and pipeline templates that other teams can build from.

Outputs · curated models · APIs · batch services · deployment workflows · backfills
/ delivery

Ship for feature velocity

The goal is not only to load data successfully. The goal is to help product, analytics, engineering, and operations teams launch features faster because the data layer is understandable, tested, and dependable.

Result · fewer one-off queries · faster releases · clearer ownership · safer change
↳ systems that enabled product work

These are not diagrams for the sake of architecture. Each system exists to make a business capability easier to ship: carbon and consumption analytics, governed warehouse models, reusable enrichment, reporting, and operational workflows that need trustworthy data at scale.

Ellipse · AWS · ingestion and carbon API platform
2023 – 2024 · shipped
Diagram: Snowflake parameterized export → S3 raw landing buckets → EventBridge → SQS queues → Step Functions · Lambdas · Glue (ETL) → Redis cache · PostgreSQL · InfluxDB · S3 (consolidated) · λ-backed Carbon API → Power BI dashboards · client apps
Pattern: Snowflake exports land in S3, EventBridge routes file events into SQS, and Lambdas with Step Functions coordinate downstream processing through Glue. Curated outputs land in PostgreSQL, InfluxDB, and consolidated S3, with a Lambda-backed Carbon API serving derived metrics to reporting tools.
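A minimal sketch of the SQS-to-Step-Functions hand-off this pattern implies (illustrative only: the state machine ARN, event fields, and naming are assumptions, not the production code):

    import json
    import os
    import boto3

    sfn = boto3.client("stepfunctions")
    STATE_MACHINE_ARN = os.environ["STATE_MACHINE_ARN"]  # hypothetical config

    def handler(event, context):
        # Each SQS record wraps an S3 object-created event routed by EventBridge
        for record in event["Records"]:
            detail = json.loads(record["body"])["detail"]
            bucket = detail["bucket"]["name"]
            key = detail["object"]["key"]
            # One execution per landed file; a deterministic execution name
            # lets Step Functions flag accidental duplicate starts
            sfn.start_execution(
                stateMachineArn=STATE_MACHINE_ARN,
                name=key.replace("/", "-")[:80],
                input=json.dumps({"bucket": bucket, "key": key}),
            )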
Analytical Data Processing · Azure and Snowflake · ELT warehouse
2024 – present · in production
Diagram: on-prem sources (MS SQL Server · Oracle · flat-file feeds) → Azure Data Factory extract → Azure Blob raw landing → Azure Blob processed (ADF) → Snowflake RAW (ext. stage) → Stage → Dim → Fact → consumers (REST API · views · analytics · Power BI)
Pattern: Azure Data Factory orchestrates extracts from on-prem sources into raw Azure Blob, applies processing into a curated Blob layer, and makes the data available to Snowflake through external stages. Snowflake then loads RAW, Stage, Dim, and Fact layers for APIs, reporting views, analytics, and Power BI.
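A sketch of the Blob-to-Snowflake hand-off under stated assumptions (account, container, table, and SAS token are all placeholders): an external stage over the processed container, plus a COPY into RAW that is safe to rerun because Snowflake skips files it has already loaded.

    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account", user="etl_user", password="...",  # placeholders
        warehouse="LOAD_WH", database="ANALYTICS", schema="RAW",
    )
    cur = conn.cursor()
    # External stage over the processed Blob container (placeholder URL/SAS)
    cur.execute("""
        CREATE STAGE IF NOT EXISTS processed_blob
          URL = 'azure://myaccount.blob.core.windows.net/processed'
          CREDENTIALS = (AZURE_SAS_TOKEN = '...')
          FILE_FORMAT = (TYPE = PARQUET)
    """)
    # COPY tracks load history, so rerunning does not reload the same files
    cur.execute("""
        COPY INTO RAW.BILLING_READINGS
        FROM @processed_blob/billing/
        MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
    """)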
02 / selected work

Selected work where business context shaped the engineering design.

Aug 2023 — present · Boston, MA

ENGIE Impact

Senior Data Engineer · DEA team

At ENGIE Impact, my work sits between business process, product delivery, and platform engineering. The systems I build support billing, carbon accounting, consumption analytics, reporting, and internal data products, so the design work is not only about throughput. It is about understanding how the data is used, where definitions can drift, and how to make changes safe when product requirements evolve.

  • Served as the accountable engineering owner for core data delivery, converting work previously split across an outsourced 7-person team into a focused internal ownership model with clearer requirements, priorities, and production accountability.
  • Built and operated an event-driven AWS ingestion platform processing 500M+ records/day into PostgreSQL and InfluxDB, supporting billing, reporting, carbon analytics, and client-facing dashboards used across enterprise accounts.
  • Partnered with product and engineering stakeholders to translate feature needs into data contracts, source-to-target logic, validation rules, and release plans, reducing ambiguity before code was written.
  • Replaced legacy "deploy-everything" Snowflake releases with a manifest-driven, change-only CI/CD process, reducing release SQL from ~95K lines to 1.5K–3K lines across 247 files and making data-platform changes easier to review and ship.
  • Proposed and designed direct Snowflake sourcing, helping retire Matillion for $40K–$50K in annual savings while moving teams from rigid once-daily extracts toward more flexible, on-demand access patterns.
  • Designed a CDC-style ingestion pattern with hash-based change detection (sketched after this list), reducing reprocessing time from 10.5 hours to 3 hours and making backfills, corrections, and late-arriving updates safer to run.
  • Productized FX and geography enrichment through batch REST and GraphQL APIs, turning repeated lookup logic into a shared internal service that product teams could reuse instead of rebuilding one-off Snowflake queries.
  • Modeled Snowflake external stages on Azure Blob with a RAW contract view layer, then used dbt to build Stage, Dim, and Fact models that gave analytics and downstream services a consistent source of truth.
  • Implemented Kafka and Spark ingestion with schema validation, deduplication, and idempotent writes, improving end-to-end performance by 30–50% while keeping failure recovery predictable.
  • Unblocked long-running initiatives, including an RDS migration and Unified UOM workstream, by tracing issues across data flows, source behavior, model assumptions, and ownership gaps until the path to release was clear.
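A minimal sketch of the hash-based change detection mentioned above (the column handling and delimiter are assumptions, not the production implementation): only rows whose hash differs from the last loaded hash get reprocessed.

    import hashlib

    def row_hash(row: dict, tracked_cols: tuple) -> str:
        # Hash tracked business columns in a fixed order, joined with a
        # separator assumed not to appear in the data
        payload = "\x1f".join(str(row.get(c)) for c in tracked_cols)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def detect_changes(incoming, previous_hashes, key_cols, tracked_cols):
        # previous_hashes: {business_key_tuple: hash} read from the warehouse
        for row in incoming:
            key = tuple(row[c] for c in key_cols)
            h = row_hash(row, tracked_cols)
            if key not in previous_hashes:
                yield "insert", row
            elif previous_hashes[key] != h:
                yield "update", row
            # unchanged rows are skipped entirely, which is where the
            # reprocessing-time win comes from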
Snowflake · dbt · AWS · Azure DevOps · Kafka · Spark · Terraform · Python · PostgreSQL · Liquibase
Sep 2019 — Sep 2021 · India

Technocrat Services

Data Engineer

Earlier in my career, I focused on building the fundamentals: dependable ETL, dimensional models, and cloud data processing that gave analysts and operations teams better access to business metrics.

  • Migrated sales ETL workloads from Amazon Redshift to PostgreSQL by redesigning extraction workflows, improving query performance by ~50% and cutting pipeline runtime by ~60%.
  • Built Airflow, Spark, and SQL pipelines across S3, Athena, Glue, and Redshift, partnering with analysts to model fact and dimension tables that improved operational reporting and reduced waste by ~6%.
Airflow · Spark · Redshift · PostgreSQL · S3 · Athena · Glue
03 / capabilities

Where I add the most value across product and data teams.

/ 01

Business and data translation

I turn vague requirements into concrete data questions: what is the grain, what is the business definition, who owns the source, what changes over time, and what needs to be true for the feature to be trusted?

Strengths · requirements · source analysis · definitions · stakeholder alignment
/ 02

Product-ready data modeling

I design Snowflake and warehouse layers that are usable by applications, APIs, dashboards, and analysts, not just technically normalized. The goal is clear contracts, consistent naming, and models that match how the business works.

Snowflake · dbt · RAW · Stage · Dim · Fact · semantic consistency
/ 03

Incremental ingestion and change handling

I build pipelines that understand change: updates, deletes, late data, replays, reprocessing, and corrections. This is what allows product teams to keep iterating without turning every data change into a production incident.

Patterns · CDC · hashing · dedupe · idempotency · backfills · reconciliation
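One way to get the idempotency this describes, sketched with assumed names (PostgreSQL via psycopg2; the table and columns are hypothetical): an upsert keyed on the business key, so replaying a batch or backfill converges to the same state instead of duplicating rows.

    import psycopg2
    from psycopg2.extras import execute_values

    UPSERT = """
        INSERT INTO readings (meter_id, read_at, value_kwh)
        VALUES %s
        ON CONFLICT (meter_id, read_at)
        DO UPDATE SET value_kwh = EXCLUDED.value_kwh
    """

    def load_batch(conn, rows):
        # rows: iterable of (meter_id, read_at, value_kwh) tuples;
        # running this twice with the same input leaves the same state
        with conn.cursor() as cur:
            execute_values(cur, UPSERT, rows)
        conn.commit()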
/ 04

Reusable data services

When multiple teams repeat the same lookup, calculation, or enrichment, I prefer turning it into a shared service or governed model. That gives feature teams one dependable path instead of many inconsistent implementations.

APIs · REST · GraphQL · enrichment · reference data · batch services
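A hypothetical shape for such a service (FastAPI here; the route, fields, and lookup are illustrative, not the internal API): batch FX lookups behind one governed endpoint instead of copied queries.

    from datetime import date
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class FxRequest(BaseModel):
        base: str    # e.g. "EUR"
        quote: str   # e.g. "USD"
        as_of: date

    @app.post("/fx/batch")
    def fx_batch(requests: list[FxRequest]) -> list[dict]:
        # One round trip for a whole batch of lookups
        return [
            {"base": r.base, "quote": r.quote, "as_of": r.as_of.isoformat(),
             "rate": lookup_rate(r.base, r.quote, r.as_of)}
            for r in requests
        ]

    def lookup_rate(base: str, quote: str, as_of: date) -> float:
        # Stub: in practice this would read a curated rates model
        raise NotImplementedError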
/ 05

Quality, auditability, and recovery

I design validation, quarantine paths, reason-coded errors, logging, and replay behavior into the system from the beginning so teams can explain what happened, fix the data, and recover without guessing.

Controls · Pydantic · validation · observability · lineage · operational support
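A minimal sketch of reason-coded validation, assuming Pydantic v2 and a hypothetical record shape: invalid rows land in a quarantine list with machine-readable reasons instead of failing the whole batch.

    from pydantic import BaseModel, ValidationError, field_validator

    class MeterReading(BaseModel):
        meter_id: str
        value_kwh: float

        @field_validator("value_kwh")
        @classmethod
        def non_negative(cls, v: float) -> float:
            if v < 0:
                raise ValueError("NEGATIVE_CONSUMPTION")  # reason code
            return v

    def validate_batch(raw_rows):
        good, quarantined = [], []
        for row in raw_rows:
            try:
                good.append(MeterReading(**row))
            except ValidationError as exc:
                quarantined.append(
                    {"row": row, "reasons": [e["msg"] for e in exc.errors()]}
                )
        return good, quarantined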
/ 06

Release engineering for data platforms

I make data changes easier to ship by bringing software-engineering discipline into database deployments: manifests, smaller diffs, environment promotion, validation gates, and CI/CD workflows that reduce release risk.

Delivery · Azure DevOps · Terraform · Liquibase · Jenkins · Git · Python tooling
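A sketch of the manifest-driven, change-only idea (the manifest format and file layout are assumptions, not the actual tooling): apply only the SQL files a reviewed manifest lists, in order, instead of replaying the whole repository.

    import json
    from pathlib import Path

    def changed_files(manifest_path: str) -> list[Path]:
        # manifest.json (assumed shape): {"changes": ["models/dim_account.sql", ...]}
        manifest = json.loads(Path(manifest_path).read_text())
        return [Path(p) for p in manifest["changes"]]

    def deploy(cursor, manifest_path: str) -> None:
        # cursor: e.g. a snowflake.connector cursor bound to the target env
        for sql_file in changed_files(manifest_path):
            statement = sql_file.read_text()
            print(f"applying {sql_file} ({len(statement.splitlines())} lines)")
            cursor.execute(statement)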
The best data platform is not just stable.
It makes the next feature, report, API, or decision easier to ship.
— engineering principle
04 / contact

Open to senior data engineering roles where data platform work directly supports product, analytics, and business outcomes.