Senior Data Engineer

You will build and operate streaming and batch data pipelines that ingest, normalise, and distribute market, trading, and portfolio data. You will design the lakehouse and time-series layers around consumer query patterns, own data contracts and schema evolution, and implement data quality checks, lineage, and self-healing. You will provide self-serve tooling, instrument observability, treat infrastructure as code, and work openly with architecture, infrastructure, platform, and product stakeholders. You will also produce derived analytics such as cross-exchange spreads, VWAP, order book microstructure, and portfolio and performance views.

Responsibilities

  • Build streaming and batch pipelines, resilient to feed and exchange failures, that ingest, normalise, and distribute market, trading, and portfolio data
  • Build self-serve tooling (SDKs, patterns, templates, AI agents) for publishing and consuming data products
  • Own data contracts and manage schema evolution
  • Design the lakehouse and time-series layer around consumer query patterns
  • Build and evolve data governance and data quality frameworks including stale-feed detection, schema validation, range checks, idempotent writes, lineage, and ownership
  • Build derived analytics such as cross-exchange spreads, VWAP at depth, order book microstructure, portfolio views, exposure, and performance
  • Make observability, cost, and performance first-class concerns
  • Treat infrastructure as code (Docker, Terraform, CI/CD)
  • Write documentation and partner closely with Architecture, Infrastructure, Platform, and other teams

Requirements

  • 8+ years of experience building production data systems
  • Strong proficiency in Python
  • Strong proficiency in SQL and in reasoning about query engines
  • Strong understanding of data modelling for streaming and analytical workloads
  • Experience designing and operating streaming systems (Kafka, Redpanda, MSK, or Kinesis)
  • Experience with time-series stores in production (ClickHouse, TimescaleDB, QuestDB, or similar)
  • Experience with lakehouse architectures, including table layout, partitioning, and compaction decisions
  • Experience building for idempotency and self-healing with safe reprocessing
  • Experience with Docker, Terraform, and CI/CD
  • Experience instrumenting logs, metrics, and traces for observability
  • Experience designing data quality and governance frameworks, including contracts, validation, lineage, and ownership
  • Understanding of financial market data (order books, trades, reference data, portfolios, exposures)
  • Ability to design, ship, operate, and improve end-to-end data systems
  • Nice to have: Lakehouse experience with Apache Iceberg or Delta Lake
  • Nice to have: Familiarity with DataHub or similar metadata/lineage platforms
  • Nice to have: Rust familiarity

Benefits

  • Flexible hours
  • Remote-first
  • Business-hours on-call shared across the team
  • Regular online get-togethers
  • Yearly onsite
  • Autonomy on how you work
  • Strong cross-functional partners