Machine Learning Compute Efficiency Lead, Infrastructure & Planning
- Own and support ML compute management for Apple’s inference workloads (GPU, TPU, and custom silicon) to enable large-scale model serving.
- Collaborate closely with Apple Intelligence and ML engineering teams to understand their roadmaps and resource pain points, then develop and implement resource strategies that address them.
- Optimize Apple’s ML workloads by driving performance improvements, maximizing resource utilization, and reducing service costs through deep root cause analysis that shapes both engineering decisions and the end customer experience.
- Architect solutions for large-scale optimization problems, including capacity allocation, workload scheduling, and cost reduction, enabling Apple’s AI-driven experiences.
- Advocate on behalf of Apple’s ML engineers to bring a consolidated view of ML platform and model inference requirements to Apple’s internal infrastructure platform providers and third-party public cloud providers.
Minimum Qualifications
BS in Computer Science, Computer Engineering, or equivalent practical experience
7+ years in ML infrastructure, systems architecture, or efficiency/optimization roles at scale
Strong conceptual understanding of foundation model inference and serving at scale, distributed training (data, tensor, and pipeline parallelism), GPU/TPU utilization, memory hierarchies, and cluster scheduling
AI-fluent and able to adapt quickly to AI-assisted workflows and tools
Proven track record of driving complex cross-org technical initiatives through influence, not authority
Strong analytical skills with experience designing or interpreting utilization analyses, capacity models, or efficiency metrics
Clear written and verbal communication; comfortable presenting to VPs and whiteboarding with senior ML engineers
Preferred Qualifications
MS or PhD in a relevant field
Direct experience with foundation model serving, inference, and training at scale
Familiarity with PyTorch, JAX, cluster management (Slurm, Kubernetes), or GPU/TPU hardware
Prior experience in efficiency, FinOps, or capacity planning
Experience negotiating technical roadmaps with platform or infrastructure teams
Background in technical and financial decision-making (TCO modeling, cost optimization)