Senior Data Engineer

You will build and operate streaming and batch data pipelines that ingest, normalise, and distribute market, trading, and portfolio data. You will design the lakehouse and time-series layers around consumer query patterns, own data contracts and schema evolution, and implement data quality, lineage, and self-healing. You will provide self-serve tooling, instrument observability, treat infrastructure as code, and work openly with architecture, infrastructure, platform, and product stakeholders. You will produce derived analytics such as cross-exchange spreads, VWAP, order book microstructure, and portfolio/performance views.

Responsibilities

Build streaming and batch pipelines, resilient to feed and exchange failures, that ingest, normalise, and distribute market, trading, and portfolio data
Build self-serve tooling (SDKs, patterns, templates, AI agents) for publishing and consuming data products
Own data contracts and manage schema evolution
Design the lakehouse and time-series layer around consumer query patterns
Build and evolve data governance and data quality frameworks including stale-feed detection, schema validation, range checks, idempotent writes, lineage, and ownership
Build derived analytics such as cross-exchange spreads, VWAP at depth, order book microstructure, portfolio views, exposure, and performance
Make observability, cost, and performance first-class concerns
Treat infrastructure as code (Docker, Terraform, CI/CD)
Write documentation and partner closely with Architecture, Infrastructure, Platform, and other teams

Requirements

8+ years of experience building production data systems
Strong proficiency in Python
Strong proficiency in SQL and in reasoning about query engines
Strong understanding of data modelling for streaming and analytical workloads
Experience designing and operating streaming systems (Kafka, Redpanda, MSK, or Kinesis)
Experience with time-series stores in production (ClickHouse, TimescaleDB, QuestDB, or similar)
Experience with lakehouse architectures, including table layout, partitioning, and compaction decisions
Experience building for idempotency and self-healing with safe reprocessing
Experience with Docker, Terraform, and CI/CD
Experience instrumenting logs, metrics, and traces for observability
Experience designing for data quality and governance, including contracts, validation, lineage, and ownership
Understanding of financial market data (order books, trades, reference data, portfolios, exposures)
Ability to design, ship, operate, and improve end-to-end data systems
Nice to have: Lakehouse experience with Apache Iceberg or Delta Lake
Nice to have: Familiarity with DataHub or similar metadata/lineage platforms
Nice to have: Rust familiarity

Benefits

Flexible hours
Remote-first
Business-hours on-call shared across the team
Regular online get-togethers
Yearly onsite
Autonomy on how you work
Strong cross-functional partners