Senior Databricks Engineer

Bounteous is a premier end-to-end digital transformation consultancy dedicated to partnering with ambitious brands to create digital solutions for today’s complex challenges and tomorrow’s opportunities. With uncompromising standards for technical and domain expertise, we deliver innovative and strategic solutions in Strategy, Analytics, Digital Engineering, Cloud, Data & AI, Experience Design, and Marketing.

Our Co-Innovation methodology is a unique engagement model designed to align interests and accelerate value creation. Our clients worldwide benefit from the skills and expertise of more than 4,000 expert team members across the Americas, APAC, and EMEA. By partnering with leading technology providers, we craft transformative digital experiences that enhance customer engagement and drive business success.

We are seeking a Senior Databricks Engineer to design, build, and optimize large-scale data and analytics platforms on the Databricks Lakehouse. In this role, you will own the architecture and delivery of production-grade data pipelines, partner with analytics and data science teams, and set engineering standards for performance, reliability, and cost efficiency across our cloud data estate.

Information Security Responsibilities:

Promote and enforce awareness of key information security practices, including acceptable use of information assets, malware protection, and password security protocols.

Identify, assess, and report security risks, focusing on how these risks impact the confidentiality, integrity, and availability of information assets.

Understand and evaluate how data is stored, processed, or transmitted, ensuring compliance with data privacy and protection standards (GDPR, CCPA, etc.).

Ensure data protection measures are integrated throughout the information lifecycle to safeguard sensitive information.

Role and Responsibilities:

Architect, build, and maintain scalable ETL/ELT pipelines on the Databricks Lakehouse Platform using PySpark, Spark SQL, and Delta Lake.

Design and implement medallion (bronze/silver/gold) data architectures and enforce data quality, governance, and lineage standards.

Optimize Spark jobs and cluster configurations for performance and cost, including partitioning, caching, and autoscaling strategies.

Implement and manage Unity Catalog for access control, data governance, and cross-workspace asset sharing.

Build and orchestrate workflows using Databricks Workflows, Delta Live Tables, and CI/CD pipelines.

Collaborate with data scientists, analysts, and business stakeholders to translate requirements into reliable data products.

Establish engineering best practices, conduct code reviews, and mentor junior data engineers.

Monitor production pipelines, troubleshoot failures, and drive root-cause analysis and continuous improvement.

Required Qualifications:

5+ years of data engineering experience, with 3+ years building production solutions on Databricks and Apache Spark.

Expert proficiency in Python (PySpark) and advanced SQL.

Deep hands-on experience with Delta Lake, Unity Catalog, and the medallion architecture pattern.

Strong experience with at least one major cloud platform (AWS, Azure, or GCP) and its core data services.

Proven track record of optimizing Spark performance and managing cluster cost.

Experience with data modeling, warehousing concepts, and building dimensional/analytics-ready datasets.

Proficiency with Git-based version control, CI/CD, and infrastructure-as-code.

Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.

Preferred Qualifications:

Databricks certification (Data Engineer Associate/Professional).

Experience with Delta Live Tables, structured streaming, and real-time data processing.

Familiarity with MLflow and supporting machine learning workflows in production.

Experience with orchestration and transformation tools (e.g., Airflow, dbt) and data observability platforms.

Exposure to data governance, security, and compliance frameworks (e.g., GDPR, HIPAA, SOC 2).

Hands-on experience using AI coding assistants (e.g., Claude Code, GitHub Copilot, Cursor) to accelerate development, refactoring, and code review.

Familiarity with large language model APIs and SDKs (e.g., Anthropic Claude, OpenAI) and prompt engineering for data and analytics use cases.

Experience integrating GenAI capabilities into data pipelines or applications, including retrieval-augmented generation (RAG) and vector search.

Awareness of responsible AI practices, including evaluation, guardrails, and cost/latency trade-offs when deploying LLM-based solutions.

We invite you to stay connected with us by subscribing to our monthly job openings alert.

Bounteous is proud to be an equal opportunity employer. Bounteous does not discriminate on the basis of race, religion, color, sex, gender identity, sexual orientation, age, physical or mental disability, national origin, veteran status, or any other status protected under federal, state, or local law. Bounteous is willing to sponsor eligible candidates for employment visas.

#BI-Remote

#LI-Remote