Member of Technical Staff - Distributed Systems
About Us
Gimlet is building the next generation of AI infrastructure: large-scale AI datacenters and the orchestration platform that coordinates them.
The future of AI will require vastly more compute than exists today. But as AI workloads become more complex and new hardware architectures emerge, simply deploying more GPUs isn't enough. The challenge is making increasingly diverse compute work together.
Gimlet's platform intelligently partitions and routes workloads across heterogeneous hardware, enabling step-function improvements in performance and efficiency. Customers deploy through production-grade APIs without needing to think about hardware selection, placement, or optimization.
We work with foundation labs, hyperscalers, and AI-native companies to power production workloads at massive scale and help define the infrastructure layer for the future of AI.
About the role
At Gimlet, we believe every hire changes the company.
As a Series A company, talent density matters more than headcount. The engineers we hire today will shape the systems, culture, and standards that define Gimlet for years to come.
The future of AI infrastructure will not be built on a single hardware platform. It will be built on systems capable of coordinating increasingly heterogeneous compute at unprecedented scale.
This role is an opportunity to help build that future. We are not optimizing for headcount; we are optimizing for talent density.
You will design and operate the distributed systems that schedule, route, and coordinate AI workloads across thousands of nodes and diverse hardware architectures.
What success looks like
In the first 12-18 months, you will help:
Build scheduling and orchestration systems that coordinate workloads across heterogeneous hardware
Improve reliability and fault tolerance for production AI infrastructure operating at scale
Create APIs and control planes that simplify deployment for customers running mission-critical workloads
Influence the architecture of a platform that will power the next generation of AI systems
You may be a good fit if you have
Strong software engineering fundamentals
Experience building or operating distributed systems in production environments
Comfort reasoning about concurrency, failure modes, and tradeoffs in large-scale systems
Strong candidates may also have
Experience with Kubernetes or Kubernetes-adjacent systems beyond basic usage
Experience designing service-oriented architectures using RPC or asynchronous messaging
Familiarity with scheduling, queues, or resource management systems
Experience building reliable APIs and running systems under high load
Software development experience in languages commonly used for systems development (e.g., Go, C++, Python)
What Makes Gimlet Different
Most AI infrastructure companies are focused on deploying more compute. We are focused on making increasingly diverse compute work together.
We are not building another cloud platform. We are building the orchestration layer for the future of AI infrastructure. We believe the future of AI will require intelligent orchestration across many hardware architectures, datacenters, and execution environments.
The systems we build today will help define how AI workloads are deployed for the next decade.