Data Engineer

Responsibilities:

Build and optimise scalable data pipelines using PySpark and SQL
Design and implement ETL/ELT processes for batch and streaming data
Develop data solutions using Databricks Lakehouse and Delta Lake
Ingest and integrate data from internal and external sources (e.g. Kafka, CDC)
Optimise Spark jobs and data workflows for performance, scalability, and cost efficiency
Manage infrastructure and environments using Terraform (IaC)
Ensure data quality and reliability, including pipeline monitoring
Implement governance and access controls (e.g. Unity Catalog)
Deliver clean, structured, and accessible data for analytics and business use
Collaborate with cross-functional teams to support analytics, reporting, and AI/ML initiatives

Qualifications:

Demonstrated experience in data engineering, with a proven ability to build scalable data solutions
Strong proficiency in Python and SQL
Hands-on experience with Apache Spark (including Structured Streaming)
Experience with Databricks (Workflows, Delta Live Tables, Lakehouse architecture)
Experience with cloud platforms (AWS, Azure, or GCP)
Experience with Terraform or similar infrastructure-as-code tools
Experience working with structured and semi-structured data (e.g. JSON)
Familiarity with CI/CD, modular development, and code documentation
Strong communication skills and ability to work independently with a high level of ownership
Preferred: Databricks certification (Associate or Professional) and exposure to data tools such as Kafka, dbt, or similar technologies
Advantageous: knowledge of Scala or other programming languages, and experience working in Agile development environments
