Databricks Architect
Project description
We are seeking a highly experienced Databricks Architect to lead the design, implementation, and optimization of enterprise-grade data platforms and AI-enabled workloads for a global food corporation. This role requires a hands-on technical leader who can translate business and data requirements into scalable Databricks-based solutions while ensuring robust governance, performance, and operational resilience. The ideal candidate will combine deep Databricks expertise with strong architecture leadership and the ability to operate within a managed service delivery model.
Responsibilities
- Lead the architecture, design, and implementation of Databricks-based data platforms, pipelines, and AI/ML workloads.
- Define scalable lakehouse patterns using core Databricks capabilities such as Apache Spark, Delta Lake, Databricks SQL, MLflow, and governance components.
- Design robust batch and streaming data pipelines for high-volume, business-critical workloads.
- Establish architecture standards for performance optimization, workload orchestration, reliability, observability, and cost control.
- Translate business and technical requirements into solution blueprints, reference architectures, and implementation roadmaps.
- Collaborate with data engineers, analysts, data scientists, DevOps teams, and business stakeholders to deliver end-to-end solutions.
- Ensure strong data governance, security, and compliance practices across the Databricks environment.
- Support AI and advanced analytics use cases by enabling reliable feature engineering, model lifecycle practices, and production-ready data foundations.
- Provide technical leadership during solution delivery, troubleshooting, and optimization of existing workloads.
- Act as a trusted advisor to client stakeholders, offering recommendations on architecture decisions, risks, trade-offs, and delivery priorities.
Skills
Must have
- Deep hands-on expertise with the Databricks platform, including architecture, workspace design, cluster strategy, job orchestration, and platform optimization.
- Strong command of Apache Spark and distributed data processing concepts, including performance tuning and optimization for complex data engineering workloads.
- Proven experience designing and delivering enterprise data solutions using Delta Lake and lakehouse architecture principles.
- Strong proficiency in Python and SQL; Scala is an advantage.
- Demonstrated capability in building and optimizing ETL/ELT pipelines, data models, and large-scale ingestion frameworks.
- Experience supporting AI/ML workloads on Databricks, including MLflow, model lifecycle considerations, and production-ready data preparation.
- Solid knowledge of cloud-native architecture patterns on AWS, Azure, or GCP, including storage, networking, identity, and security integration.
- Experience with data governance, access control, lineage, and compliance frameworks in enterprise environments.
- Familiarity with CI/CD, infrastructure-as-code, monitoring, and operational best practices for data platforms.
- Ability to engage in deep technical problem-solving and contribute immediately to complex delivery scenarios with minimal ramp-up time.
Nice to have
- Databricks certifications relevant to data engineering, machine learning, or platform architecture.
- Experience in large enterprise or global delivery environments with complex stakeholder landscapes.
- Background in consumer goods, manufacturing, supply chain, or similarly data-intensive industries.
- Experience with real-time processing, orchestration frameworks, and integration with enterprise data ecosystems.
- Strong communication skills with the ability to explain architecture decisions to both technical and non-technical stakeholders.