Data Engineer

Responsibilities

  • Build and maintain ETL/ELT pipelines in Microsoft Fabric Notebooks (Python/PySpark) ingesting data from multiple source systems via REST and SOAP APIs

  • Implement Bronze layer landing (raw data, no transformation), Silver layer cleansing/typing/deduplication, and Gold layer aggregation for business analytics

  • Design and build custom API connectors with OAuth 2.0/Bearer token authentication, incremental sync, pagination handling, rate-limit/retry logic, and error recovery

  • Configure and manage Fabric Workspaces across Dev/Test/Production environments using Fabric Deployment Pipelines and Git integration for CI/CD

  • Build and maintain Power BI semantic models (star/snowflake schemas) supporting operational reporting and analytics dashboards

  • Implement row-level security (RLS) using Azure AD RBAC, warehouse-level DAX filters, and dynamic RLS patterns

  • Set up monitoring, alerting, and telemetry using Azure Monitor, Log Analytics, and Application Insights to track pipeline health and data freshness

  • Manage API credentials and secrets via Azure Key Vault with automated rotation policies

  • Embed within the client's development process, working directly with business stakeholders to gather requirements and co-develop solutions

  • Contribute to data governance: data dictionary creation, lineage tracking, and documentation of transformation logic across all pipeline stages

Requirements

  • 5+ years' experience in data engineering or a similar role in a commercial environment

  • Hands-on experience with Microsoft Fabric (Lakehouses, Notebooks, Data Pipelines, OneLake, SQL Analytics Endpoints) or equivalent depth in Azure Synapse Analytics

  • Advanced SQL for data transformation, performance tuning, and Delta Lake table management

  • Proficient in Python/PySpark for data processing, API integration, and pipeline automation

  • Proven experience building custom API connectors (REST, SOAP/XML) with OAuth 2.0, pagination, rate limiting, and incremental sync patterns

  • Strong understanding of Medallion Architecture, data lakehouse concepts, and Delta Lake (merge, upsert, soft-delete handling, time travel)

  • Experience with dimensional modelling (star/snowflake schemas) and Power BI semantic model development

  • Working knowledge of Azure services: Key Vault, Azure Monitor, Log Analytics, Azure AD/Entra ID

  • Experience translating business requirements into dimensional models iteratively (not just building pre-designed schemas)

  • Comfort operating autonomously with minimal supervision in a client-embedded model

  • Experience implementing row-level security in Power BI (DAX-based RLS, dynamic security models)

  • Familiarity with CI/CD in a Fabric context: Deployment Pipelines, Git integration, environment promotion workflows

  • Experience with semi-structured data formats (JSON, XML) and handling schema evolution across pipeline layers

Preferred

  • Prior work in retail, wholesale, distribution, or fresh produce / supply chain data environments

  • Experience with ERP data extraction and replication tools in a Fabric or Azure context

  • Knowledge of dbt, Power Query, or similar transformation frameworks

  • Relevant Azure or Microsoft certifications (DP-600, DP-203, PL-300)

  • Openness to AI coding assistants (GitHub Copilot, Cursor, Claude) as part of your development workflow

What We Offer

  • Competitive salary with mentorship and career growth opportunities

  • Hands-on work with cloud-first technologies (Microsoft Fabric, Azure, Power BI)

  • Long-term embedded role with a single client, providing deep domain expertise and continuity

  • Fast-paced, innovative environment where AI-augmented development is the norm
