LLM Ops Engineer

Who We Are

Healthcare needs a better rhythm: one that keeps care continuous and deeply human. Heidi is building an AI Care Partner that works alongside clinicians to make that possible.

We’re a team of doctors, engineers, designers, researchers, and creatives building tools that help clinicians stay focused on what matters most: their patients.

In just 18 months, Heidi has given back more than 18 million hours to healthcare professionals — supporting 73 million patient visits in 116 countries. Today, more than two million patient visits each week are powered by Heidi worldwide.

Backed by nearly $100 million in funding, we’re growing in the US, UK, Canada, and Europe, partnering with leading health systems including the NHS, Beth Israel Lahey Health, and Monash Health.

What you’ll do

LLM platform on Kubernetes

Design, deploy and maintain AWS/EKS infrastructure running GPU-backed model workloads
Manage GPU node pools, tune autoscaling for inference traffic patterns, and own the full model serving lifecycle from container builds to production rollouts
Write and maintain infrastructure as code in Terraform

Model evaluation and quality

Build tooling that measures whether models are performing: clinically accurate, latency-appropriate, and cost-efficient
Design offline evaluation harnesses, automated regression tests, and dashboards that surface regressions before they reach clinicians

Cost and performance

Own GPU utilization, quantization, request batching, model routing, and spot/on-demand node strategy
Work closely with the Models Team on fine-tuning workflows and model selection tradeoffs

Cross-functional delivery

Collaborate with product engineers and clinicians on prompt engineering, context window management, and model data pipelines
Close the feedback loop between infrastructure quality and real-world clinical performance

Observability

Instrument token usage, latency P99s, GPU memory pressure, hallucination rates, and error classes
Define alerting thresholds and build self-serve model health tooling so the team is not relying on Slack threads to know something is wrong

What we’re looking for

Strong AWS and Kubernetes experience, with hands-on depth in EKS, GPU workload scheduling, and IAM patterns that do not cut corners on healthcare data requirements
Practical LLMOps experience: model serving frameworks (vLLM, TGI or similar), prompt versioning, model registry management, A/B deployment, shadow traffic, rollback strategies
Comfort with Python and enough ML context to hold a real conversation about fine-tuning, RLHF, RAG architectures, and evaluation methodology
Infrastructure-as-code fluency in Terraform
Experience building evaluation frameworks for generative models — not just accuracy metrics, but latency, cost, and output safety
A bias toward observable, auditable systems — especially in a regulated industry context
Strong engineering habits: small PRs, meaningful code review, test coverage, and a low tolerance for tech debt
Willingness to get close to the clinical domain: you do not need to be a clinician, but you need to care about the consequences of model failures in healthcare

Bonus

Experience working with HIPAA, the Australian Privacy Act, or similar healthcare data requirements
Prior work on latency-sensitive inference pipelines for consumer or clinical products
Familiarity with agent orchestration frameworks (LangChain, LangGraph, or similar)
Experience with Karpenter or cluster autoscaler tuning for bursty GPU workloads

The way we work

1. Build to Last

We design for safety and reliability so clinicians, patients, and our teams can trust what we build every day.

2. Own Your Practice

Ideas rise on merit, not title, and everyone shares responsibility for the standards we set together.

3. Move Fast, Stay Steady

We move quickly but never at the cost of trust. Progress only matters if people can depend on what we make.

4. Make Others Better

Honest feedback, steady support, and shared growth keep our teams improving together.

Why you will flourish with us

Flexible hybrid working environment, with 3 days in the office
A generous personal development budget of $500 per annum
Join a diverse team and learn from some of the best engineers and creatives
Become an owner, with shares (equity) in the company: if Heidi wins, we all win
The rare chance to create a global impact as you immerse yourself in one of Australia’s leading healthtech startups
The opportunity to fast-track your startup career if you make an impact quickly

Heidi is dedicated to creating an equitable, inclusive, and supportive work environment that brings people together from diverse backgrounds, experiences, and perspectives. Our strength is in our differences. We're proud to be an equal opportunity employer and welcome all applicants as we're committed to promoting a culture of opportunity for all.
