Director, Site Reliability Engineering
Why this role
As our Director of Site Reliability Engineering, reporting to our VP of Platform Engineering, you'll own the core infrastructure layers that everything at Doctolib runs on: cloud infrastructure, database operations, network infrastructure, and observability. You will also lead the Doctolib Operations Center (DOC) and drive a decisive shift from reactive operations to a proactive, world-class reliability culture.
This is a rare opportunity to shape the infrastructure backbone of Europe's leading healthtech company, at a moment when Doctolib is actively expanding multi-cloud capabilities, scaling to new countries, and building the reliability culture that will define the next decade of healthcare innovation.
Why this is an extraordinary challenge
- Real stakes, every day. When Doctolib is down, consultations don't happen, diagnoses are delayed, care journeys are interrupted. The infrastructure you build is a direct lever on patient outcomes, in a world where 8 of the top 10 causes of death in Europe are preventable.
- A once-in-a-generation platform transition. Multi-cloud, monolith modularisation, international expansion: all happening simultaneously. You won't inherit a finished platform. You'll define what it becomes.
- Reliability as the competitive moat. As we scale AI health companions, automate clinical workflows, and launch across Europe, the speed and resilience of the platform directly determine how fast 700+ engineers can ship innovations that change healthcare.
- A cultural build, not just a technical one. The incident response culture, observability standards, and operational ownership model you establish here will shape how Doctolib engineers work for years to come.
What you'll do
- Build and run a world-class SRE org of 25+ engineers across Cloud Infrastructure, Database & Storage, Network Infrastructure, Observability Tooling, and the Doctolib Operations Center
- Own the infrastructure strategy and roadmap — cloud, database, network, observability — and deliver against company OKRs
- Lead the Doctolib Operations Center: set incident response standards, drive MTTR reduction, embed blameless post-mortem culture across engineering
- Architect and execute our multi-cloud strategy — reducing vendor lock-in, cutting migration costs, and enabling international expansion
- Own network infrastructure at scale: load balancing, CDN/WAF, VPCs, peering, zero-trust networking across a high-traffic, multi-country platform
- Drive observability as a product — give 700+ engineers true visibility into system health and turn observability maturity into an operational excellence lever
- Lead from the front as a senior technical voice in the Platform org and broader Tech leadership team
Who you are
- 12+ years in software engineering, including 5+ years leading managers and running infrastructure or SRE organisations at scale
- Track record of taking SRE practices from reactive to proactive — with measurable reductions in incidents and MTTR
- Strong multi-cloud and network infrastructure experience: load balancing, CDN/WAF, VPCs, peering, at high-traffic scale
- Deep database operations background: large-scale transactional systems (PostgreSQL, Aurora), streaming/CDC (Kafka), data layer FinOps
- Experience building observability platforms that give teams genuine visibility — metrics, logs, traces, alerting
- Sharp process thinking: SLOs, error budgets, incident management, blameless post-mortems
- Outcome-driven: you track reliability, cost efficiency, and engineering velocity as business metrics, not just technical ones
- Strong communicator and influencer at executive level — equally credible with senior engineers and business stakeholders
- Builder of high-performing, people-first engineering cultures
- Fluent in English; comfortable in fast-paced, international environments
- You recognise yourself in our playbook values
Bonus points if you have…
- Experience in healthcare, regulated, or high-compliance industries (HDS, ISO 27001, SOC2, GDPR, data sovereignty)
- Familiarity with our stack: Ruby on Rails, Node.js, Go, Python, React, AWS, GCP, Kubernetes, PostgreSQL, Datadog, GitHub Actions
- French language proficiency
- Experience with AI-augmented infrastructure tooling or ML platform operations
- M&A or post-acquisition infrastructure integration experience
What we offer
- A Deutschlandticket (Germany-wide public transport pass) fully paid for by Doctolib
- 28 vacation days + 1 additional day for each full calendar year of employment (up to a maximum of 30 days)
- Work from abroad for up to 10 days per year thanks to our flexibility days policy
- Company health insurance with great supplementary benefits through our partner Allianz
- Company pension scheme (bAV) through Allianz with an employer subsidy of 40% (15% within the probationary period)
- Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth
- The Doctolib Parent Care program, which includes one additional month of parental leave and much more
- Free mental health and coaching services through our partner Moka.care
- Subsidized sports membership through our partner Urban Sports Club
- A flexible workplace policy offering both hybrid and office-based modes
- Alongside healthy snacks and our regular breakfast buffet, we provide a subsidized meal benefit
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support