Senior Observability Engineer

About Us:

Rent the Runway (RTR) is transforming the way we get dressed by pioneering the world’s first Closet in the Cloud. Founded in 2009, RTR has disrupted the $2.4 trillion fashion industry by inspiring women with a more joyful, sustainable and financially savvy way to feel their best every day. As the ultimate destination for circular fashion, the brand now offers infinite points of access to its shared closet via a fully customizable subscription to fashion, one-time rental or ownership. RTR offers designer apparel and accessories from hundreds of brand partners and has built in-house proprietary technology and a one-of-a-kind reverse logistics operation. RTR has been named to CNBC’s “Disruptor 50” five times in ten years, and has been placed on Fast Company’s Most Innovative Companies list multiple times.

Galway Office:

Rent the Runway established its European Technology Hub in Galway in April 2019. Based in the historic Claddagh area of the city, the growing team in Galway tackles core technology challenges and influences the next generation of services critical to Rent the Runway’s success and continued growth.

The Galway office is Rent the Runway’s first international office outside the US and enables the company to significantly expand its Software Engineering, Product Development, Machine Learning Engineering and Data Science footprint. Rent the Runway’s Galway-based employees have the opportunity to grow their careers across several roles and career paths in Technology.

Our engineering team is smart, pragmatic, and entrepreneurial. We practice continuous integration & test-driven development, engage in constant peer code reviews & pair programming, and work hard to give back to the software community through open-source contributions.

About the Role:

We’re looking for a Senior Observability Engineer to lead the development and scaling of telemetry systems that keep our platforms reliable, performant, and resilient. You’ll play a critical role in shaping observability practices across engineering and infrastructure teams, enabling better incident response, deeper system insight, and stronger delivery outcomes.

You will define how we measure, detect, and respond to what matters, empowering teams to build and operate services with confidence. As a senior contributor, you will own key initiatives, drive adoption of observability standards, and collaborate across the company to enable scalable, efficient engineering.

What You’ll Do:

  • Lead the architecture, delivery, and continuous improvement of observability solutions, leveraging tools like Splunk Observability Cloud, Google Cloud Observability or equivalent platforms.
  • Build scalable, automated telemetry pipelines using Terraform and modern Infrastructure-as-Code (IaC) workflows to support auditability, reliability, and self-service adoption.
  • Define and evolve best practices for metrics, traces, logs, and events to ensure high-signal alerting and consistent, actionable instrumentation across our systems.
  • Collaborate with application, platform, security, and compliance teams to integrate observability into every phase of the development lifecycle, from instrumentation patterns to SLOs to post-incident reviews.
  • Drive the definition and adoption of internal standards for service-level indicators (SLIs), error budgets, and system health across services.
  • Lead adoption of modern AI-assisted development and debugging workflows within observability tooling and automation, helping teams accelerate incident response, system instrumentation, and root cause analysis.
  • Identify and resolve cross-functional observability challenges through simplification, standardisation, and durable solution design.
  • Participate in SRE/Platform on-call rotations to stay connected to production challenges and continuously improve observability tooling, alert design, and incident response.
  • Provide technical guidance to engineers across teams, helping them apply telemetry principles to improve system insight.
  • Champion a culture of reliability through enablement, building reusable frameworks, shared documentation, and team-level training that improves observability without increasing toil.

About You:

  • 5+ years of experience in SRE, DevOps, or platform engineering roles, with a deep specialisation in observability systems and telemetry design.
  • Recognised as a subject-matter expert in observability tools and practices, especially in cloud-native, distributed systems environments.
  • Proven track record of delivering high-impact technical projects that improve system visibility, reduce operational risk, or enhance incident response at scale.
  • Deep knowledge of metrics, logs, traces, and events, and how to apply them to both product services and infrastructure components.
  • Hands-on experience managing telemetry pipelines with Terraform, CI/CD tooling, and service instrumentation in Kubernetes-based environments.
  • Strong understanding of system-level design principles, including service mesh architectures, asynchronous message flows, and distributed systems behaviour.
  • Adept at communicating across audiences, able to translate complex technical ideas into clear, actionable insights for both engineers and stakeholders.
  • Operates with ownership and curiosity: proactively identifies failure modes, simplifies complex systems, and drives structured root cause analysis.
  • A natural collaborator who fosters alignment, builds consensus, and leads through influence across technical and non-technical domains.

Benefits:

At Rent the Runway, we’re committed to the happiness and well-being of our employees, and aim to create a workplace that fosters both personal and professional growth. Our inclusive benefits include, but are not limited to:

  • Generous Paid Time Off, including annual leave, paid bereavement, and family sick leave - every employee needs time to take care of themselves and their family.
  • Universal Paid Parental Leave for both parents, plus a flexible return-to-work programme - because we know your newest family member(s) deserve your undivided attention.
  • Paid Sabbatical after 5 years of continuous service - unplug, recharge, and have some fun.
  • Competitive Stakeholder Pension - taking care of your future.
  • Comprehensive health, dental, and dependants’ care from day 1 of employment - your health comes first, and we’ve got you covered.
  • Company-wide events and outings - our team spirit is no joke, and we know how to have fun!
  • Hybrid Work - this role requires 2-3 days per week in our Galway, Ireland office, with the option to work 2 days remotely.

Rent the Runway is an equal opportunity employer. In accordance with applicable law, we prohibit discrimination against any applicant or employee on any legally recognised basis, including, but not limited to: gender, marital status, family status, age, disability, sexual orientation, race, religion, and membership of the Traveller community.
