Senior Data Engineer
You will build and operate streaming and batch data pipelines that ingest, normalise, and distribute market, trading, and portfolio data. You will design the lakehouse and time-series layers around consumer query patterns, own data contracts and schema evolution, and implement data quality checks, lineage tracking, and self-healing mechanisms. You will provide self-serve tooling, instrument observability, treat infrastructure as code, and work openly with architecture, infrastructure, platform, and product stakeholders. You will produce derived analytics such as cross-exchange spreads, VWAP, order book microstructure, and portfolio/performance views.
Responsibilities
- Build streaming and batch pipelines that ingest, normalise, and distribute market, trading, and portfolio data, and that stay resilient to feed and exchange failures
- Build self-serve tooling (SDKs, patterns, templates, AI agents) for publishing and consuming data products
- Own data contracts and manage schema evolution
- Design the lakehouse and time-series layer around consumer query patterns
- Build and evolve data governance and data quality frameworks, including stale-feed detection, schema validation, range checks, idempotent writes, lineage, and ownership
- Build derived analytics such as cross-exchange spreads, VWAP at depth, order book microstructure, portfolio views, exposure, and performance
- Treat observability, cost, and performance as first-class concerns
- Treat infrastructure as code (Docker, Terraform, CI/CD)
- Write documentation and partner closely with Architecture, Infrastructure, Platform, and other teams
Requirements
- 8+ years of building production data systems
- Strong proficiency in Python
- Strong proficiency in SQL and the ability to reason about query engines
- Strong understanding of data modelling for streaming and analytical workloads
- Experience designing and operating streaming systems (Kafka, Redpanda, MSK, or Kinesis)
- Experience with time-series stores in production (ClickHouse, TimescaleDB, QuestDB, or similar)
- Experience with lakehouse architectures, including table layout, partitioning, and compaction decisions
- Experience building idempotent, self-healing systems that support safe reprocessing
- Experience with Docker, Terraform, and CI/CD
- Experience instrumenting logs, metrics, and traces for observability
- Experience designing data quality and governance frameworks, including contracts, validation, lineage, and ownership
- Understanding of financial market data (order books, trades, reference data, portfolios, exposures)
- Ability to design, ship, operate, and improve end-to-end data systems
- Nice to have: Lakehouse experience with Apache Iceberg or Delta Lake
- Nice to have: Familiarity with DataHub or similar metadata/lineage platforms
- Nice to have: Rust familiarity
Benefits
- Flexible hours
- Remote-first
- Business-hours on-call shared across the team
- Regular online get-togethers
- Yearly onsite
- Autonomy on how you work
- Strong cross-functional partners