Senior Platform Engineer

Are you a talented and driven distributed systems engineer, with a proven track record, capable of leading cross-organizational features? Does tech debt quiver in its boots at the sound of your name? Do you butter your bread with evented systems? Have the letters C, Q, R, and S been ruined for you in an eventually consistent manner? Have we got news for you: you’re not alone!

In this role, you’ll be a key contributor to our Platform Engineering team. You’ll get the chance to hone your skills alongside some of the best Platform Engineers (think SRE meets Ops with a heavy focus on automating everything). Our team’s current focus is on Service Delivery (Kubernetes), Infrastructure Management (Terraform), CI/CD (Argo, Atlantis), and in general all things SLOs and Operational Excellence. Your teammates are talented folks who value code that delivers 80% of the value for 20% of the work, designs that are forward-thinking enough to adapt easily to the next features, and leadership with a healthy dose of mindfulness and humility.

What you’ll do

You’ll become a key contributor to the team, taking responsibility for the success of some of our subsystems.
You’ll participate in medium- to large-impact team initiatives and, within a year, be able to execute on such projects.
You’ll help with interviewing potential teammates.
You’ll create technical designs that proactively address cost efficiency, security, and observability.
You’ll deliver technical plans, one-pagers, DRs, and other artifacts.
You’ll work with Kubernetes, GCP, Helm, Terraform, DataDog, ArgoCD, CircleCI, Atlantis, Docker (the list goes on) to deliver your work.
You’ll be responsible for improving developer velocity across the company (leveraging frameworks like DORA) and hardening our reliability and observability.
You’ll participate in on-call rotations and help keep all of Apollo afloat.
- You’ll be fully empowered to fix the root cause of issues and ruthlessly quell any noisy monitors.

Who you are

A description of our ideal candidate.

Minimum requirements

You have systems expertise and experience with stateless/fault tolerant systems, as well as familiarity with eventing patterns and distributed paradigms.
You’ve leveraged agentic tooling not only to enhance your day-to-day work but also to detect and mitigate issues in production.
You weigh technical and business trade-offs and are working at “seeing down the road and around corners.”
You enjoy cross-team collaboration and believe in an “a rising tide lifts all boats” mentality. You’re a joy to those around you, bringing everyone along for the ride.
You are data-oriented and have leveraged data to make decisions in a concrete way.
You’re familiar with Kubernetes, GCP (preferably, but AWS or Azure is great too!), and Terraform.

Nice to have

Bonus points for Helm, Docker, ArgoCD, CircleCI, DataDog, and of course GraphQL!