Data Engineer (Databricks)
Who you are
You are a Databricks-focused Data Engineer who understands that great data platforms are only as valuable as the products, AI workflows, and experiences they enable. You bring deep, production-grade expertise across the Databricks platform and know how to connect platform capabilities to real business outcomes.
You thrive in ambiguity and can quickly assess a client's data landscape to recommend and implement the right solutions. In consulting, your Databricks depth matters most when it serves the products and experiences clients actually use, and you're as comfortable in a product design conversation as you are building a Lakeflow Declarative Pipeline.
You excel at translating complex data challenges into clear technical requirements and can confidently navigate conversations with everyone from data scientists to executives. Your engineering principles are mature and grounded in real-world experience across various industries and scales.
You are genuinely curious about data platforms and keep up with the latest advances in data technology.
What you will be doing
- Design and build production data pipelines using Lakeflow Declarative Pipelines, Auto Loader, and Structured Streaming, with end-to-end ownership of ingestion, transformation, data quality expectations, and CI/CD deployment via Declarative Automation Bundles.
- Architect and implement Lakehouse solutions on Databricks — medallion architecture, Delta Lake, Unity Catalog — tailored to the client's analytics, AI, and application needs.
- Build and maintain Databricks transformation layers (Lakeflow Declarative Pipelines, PySpark notebooks, and dbt models) with data quality constraints and SLAs built in.
- Design and maintain the data and AI foundations — Unity Catalog, Feature Store, MLflow, and Model Serving — that power production ML, agent workflows, and AI-enabled digital products.
- Collaborate with product and backend engineers to design data models, APIs, and application data contracts — ensuring the platform serves the product, not just the warehouse.
- Consult with clients to understand their data challenges, develop data strategies, and implement sustainable solutions.
- Adapt your approach based on project needs — sometimes leading data architecture discussions with clients, other times supporting internal teams with specialized data expertise.
- Work within multi-cloud environments — primarily AWS and Azure — anchoring data platform recommendations around Databricks where it fits the client's architecture and goals.
- Champion data governance through Unity Catalog — access control, lineage, data quality policies, and compliance — as a first-class part of every engagement, not an afterthought.
- Design data-to-application architectures — including Lakebase-backed services and Databricks Apps — that connect governed data to AI workflows, digital products, and user-facing experiences.
- Help build Livefront's Databricks practice — contributing to accelerators, internal enablement, certification goals, and Databricks partner go-to-market materials alongside delivery work.
Why you should apply
- You want to work with passionate and talented people who are always looking for ways to make things better.
- You desire a work environment where respect, mutual trust, and egoless collaboration are paramount.
- You want colleagues who take their work seriously but not themselves, and who know how to let loose and have a good time.
- You like being part of a team with a reputation for excellence that gives back to the community by educating, mentoring, and sponsoring.
- You want to work on products and accounts that have outsized impact and reach.
- You believe in sweating the details, giving a damn about quality, and taking pride in going the extra mile.
- You want to help build a data practice specialization from the ground up — shaping how we go to market with Databricks, what we build as accelerators, and what it means to do this kind of work at a digital product company.
What you bring to the table
- 3-5 years of data engineering experience with at least 2 years in production Databricks environments, preferably in a consulting or client delivery context.
- Solid working knowledge of AWS and Azure cloud services relevant to Databricks deployments — storage, networking, IAM, and compute — with GCP familiarity a plus.
- Deep, production-grade Databricks expertise: Lakeflow Declarative Pipelines, Auto Loader, Structured Streaming, Lakeflow Jobs, and Unity Catalog (including fine-grained access control and lineage), demonstrated through shipped production workloads, not prototypes.
- Proven experience designing Lakehouse architectures — medallion patterns, Delta Lake table design, partitioning, Z-ordering, and query optimization — at production scale.
- Hands-on experience with data pipeline testing, observability, and CI/CD for data — including unit testing, data quality frameworks, and version-controlled deployments via Git and Declarative Automation Bundles.
- Strong proficiency in SQL and Python, with the ability to write clean, performant, and maintainable code.
- Understanding of data modeling, schema design, and query optimization.
- Excellent communication skills with the ability to explain complex data concepts to both technical and non-technical stakeholders.
- Strong problem-solving skills with the ability to navigate ambiguous requirements and deliver pragmatic solutions.
- Above-average discipline and personal organization skills.
- Obvious comfort with critique and peer review in the context of an iterative development process.
- A demonstrated hunger for personal and professional growth.
- A self-evident love and care for the craft of data engineering.
Bonus points if you…
- Have worked with real-time streaming technologies (Kafka, Kinesis, etc.).
- Have hands-on experience with alternative cloud data platforms — useful context for migrations and competitive assessments, though Databricks is our primary platform focus.
- Have experience in healthcare or fintech domains.
- Have hands-on experience with MLOps or LLMOps on Databricks — MLflow experiment tracking, model registry, Model Serving endpoints, or Vector Search for RAG pipelines.
- Have experience with Java, Go, or Scala.
- Have strong illustration skills for technical diagramming and data architecture documentation.
- Speak, write, and/or educate publicly about data engineering topics.
- Have contributed to open-source data projects.
- Hold or are actively pursuing a Databricks certification (Data Engineer Associate or Professional, or Apache Spark Developer) — we treat these as meaningful signals of platform depth, and they directly support our Databricks partner growth goals.
- Have experience with Databricks Apps or Lakebase; early familiarity with where the Databricks platform is heading is a strong differentiator.