Databricks Engineer - GCP Cloud

Wissen Technology

Requirements

Wissen Technology is hiring for Databricks Engineer - GCP Cloud

About Wissen Technology:
At Wissen Technology, we deliver niche, custom-built products that solve complex business challenges across industries worldwide. Since our founding in 2015, our core philosophy has been built around a strong product engineering mindset, ensuring every solution is architected and delivered right the first time. Today, Wissen Technology has a global footprint with 2000+ employees across offices in the US, UK, UAE, India, and Australia. Our commitment to excellence translates into delivering 2X impact compared to traditional service providers. How do we achieve this? Through a combination of deep domain knowledge, cutting-edge technology expertise, and a relentless focus on quality. We don't just meet expectations; we exceed them by ensuring faster time-to-market, reduced rework, and greater alignment with client objectives. We have a proven track record of building mission-critical systems across industries, including financial services, healthcare, retail, manufacturing, and more. Wissen stands apart through its unique delivery models. Our outcome-based projects ensure predictable costs and timelines, while our agile pods provide clients with the flexibility to adapt to their evolving business needs. Wissen leverages its thought leadership and technology prowess to drive superior business outcomes. Our success is powered by top-tier talent. Our mission is clear: to be the partner of choice for building world-class custom products that deliver exceptional impact, the first time, every time.

Job Summary: We are looking for a bright and dynamic engineer, motivated and able to work independently as well as in partnership with IT and Business teams spread across the globe. The candidate needs to be an exceptionally strong Python and SQL programmer with hands-on experience in GCP-native data technologies including BigQuery, Dataproc, Cloud Composer, and Datastream.

Besides technical skills, we are looking for a candidate with a strong sense of ownership and the ability to work in a diverse, cross-functional team spanning Engineering, Research, DataOps, and Compliance.

Experience: 6-8 years
Location: Mumbai/Pune/Bangalore
Mode of Work: Full time

Key Responsibilities:
- Build and maintain scalable, distributed, fault-tolerant data pipelines on GCP, including BigQuery-based lakehouse layers and Dataproc-driven Delta Lake workflows
- Actively participate in meetings with various stakeholders across data engineering, compliance, and business teams globally
- Understand market data processing and transformation needs; build pipelines to acquire, normalise, transform, and release large volumes of financial data through the OMDP data factory
- Design and implement bitemporal data models (valid-time + system-time) on BigQuery to support certified, regulatory-grade time-series datasets
- Build, use, and maintain software testing frameworks (unit / non-regression / user acceptance) for data pipelines and transformation logic
- Take complete ownership of solutions and assigned tasks, including ingestion pipelines, QA workflows, correction management, and audit trail implementation
- Work in a collaborative manner with other team members and contribute to shared platform services rather than vertical-specific implementations
- Have business acumen to understand financial concepts around reference data related to equities and other asset classes
- Support teams across data and technology in implementing AI solutions and integrating their services with MSCI's data science products and platforms, including AI-assisted ingestion, anomaly detection, and semantic search over the lakehouse using Vertex AI

Requirements:
- 6-8 years of experience in data engineering
- Proficient in Python programming: data pipeline development, transformation logic, and automation scripts
- Proficient in data query and analysis using SQL, with strong hands-on experience in BigQuery: partitioning, clustering, materialised views, and time-series query patterns at scale
- Hands-on experience building and scheduling pipelines using Cloud Composer (Apache Airflow): DAG authoring, SLA alerting, retry logic, and dependency management
- Working knowledge of Dataproc (Apache Spark): batch ingestion, Delta Lake merge operations, and incremental data processing
- Proficient in AI-assisted development tools such as GitHub Copilot, Cursor, or others for accelerating code generation and enhancing developer productivity
- Code versioning and collaboration using Git: branching strategies, pull request workflows, and pipeline-as-code practices
- Familiarity with REST APIs: consuming external data vendor APIs and building service-layer integrations
- Familiarity with GCP cloud technologies: Cloud Storage, Pub/Sub, Datastream, Cloud Monitoring, IAM, and VPC Service Controls

Good To Have Skills:
- Basic knowledge of data manipulation and analysis libraries: pandas, PySpark, or equivalent
- Basic knowledge of columnar storage, SQL-based querying, and time-series analytics (ClickHouse or equivalent)
- Familiarity with Dataplex for data discovery, lineage, policy tagging, and data quality rule management
- Understanding of Change Data Capture (CDC) patterns using Datastream for replicating transactional data into BigQuery
- Understanding of bitemporal data modeling concepts (valid-time and system-time) and the challenges of implementing them within BigQuery's append-optimised design
- Understanding of financial reference data: equities, fixed income identifiers, corporate actions, or index composition data
- Familiarity with BigQuery cost management: slot reservations, query cost controls, and workload isolation using reservations and assignments
- Exposure to CI/CD pipelines and infrastructure-as-code using Terraform for data platform deployments on GCP
- Prior experience or projects involving LLMs and Agentic AI, particularly using Vertex AI for AI-assisted data quality, anomaly detection, semantic search, or natural language querying over structured datasets, is a strong plus