Machine Learning Compute Efficiency Lead, Infrastructure & Planning

- Own and support ML compute management for Apple’s inference workloads (GPU, TPU, and custom silicon) to enable large-scale model serving.
- Collaborate closely with Apple Intelligence and ML engineering teams to understand roadmaps and resource pain points to develop and implement resource strategies.
- Optimize Apple’s ML workloads by driving performance improvements, maximizing resource utilization, and reducing service costs through deep root cause analysis that shapes both engineering decisions and the end customer experience.
- Architect solutions for large-scale optimization problems, including capacity allocation, workload scheduling, and cost reduction, enabling Apple's AI-driven experiences.
- Advocate on behalf of Apple’s ML engineers to bring a consolidated view of ML platform and model inference requirements to Apple’s internal infrastructure platform providers and 3rd party public cloud providers.

Minimum Qualifications

- BS in Computer Science, Computer Engineering, or equivalent practical experience
- 7+ years in ML infrastructure, systems architecture, or efficiency/optimization roles at scale
- Strong conceptual understanding of foundation model inference/serving at scale and distributed training (data/tensor/pipeline parallelism), GPU/TPU utilization, memory hierarchies, and cluster scheduling
- AI-fluent and capable of quickly adapting to AI workflows and empowerment
- Proven track record of driving complex cross-org technical initiatives through influence, not authority
- Strong analytical skills with experience designing or interpreting utilization analyses, capacity models, or efficiency metrics
- Clear written and verbal communication, comfortable presenting to VPs and white-boarding with senior ML engineers

Preferred Qualifications

- MS or PhD in a relevant field
- Direct experience with foundation model serving, inference, and training at scale
- Familiarity with PyTorch, JAX, cluster management (Slurm, Kubernetes), or GPU/TPU hardware
- Prior experience in efficiency, FinOps, or capacity planning
- Experience negotiating technical roadmaps with platform or infrastructure teams
- Background in technical and financial decision-making (TCO modeling, cost optimization)

