Apple Silicon GPU Driver Engineer, Graphics, Game and ML
The Apple Silicon GPU Driver Scheduler team is directly responsible for GPU workload management, including scheduling commands on the GPU, managing resources and dependencies, and ensuring responsiveness and quality of service for applications using the GPU. The GPU Scheduler team directly impacts the performance and power efficiency of all Apple products that use Apple Silicon GPUs. We are looking for an engineer with a strong engineering background who is excited to work with engineers and other leaders at Apple to deliver Apple GPUs across all Apple devices, build and ship exciting new GPU-focused features, and work with other teams to prototype future HW and SW GPU features.
In this role, you’ll architect the GPU driver scheduling layer underneath Apple's largest server-side ML and LLM workloads. You’ll design parallelism strategies that scale from a single GPU to clusters of nodes, build the synchronization and communication primitives that hold them together, and shape the HW/SW interfaces for next-generation GPU designs. You will be working at the intersection of cutting-edge ML systems, systems programming, and hardware acceleration, partnering with world-class teams across Apple software and hardware organizations to co-design scheduling primitives in next-generation GPUs, collaborate with framework and infrastructure teams to expose scheduling control where it matters, and contribute to the performance and reliability characteristics that ultimately determine inference latency and cost.
We are seeking an individual with curiosity and passion to learn and innovate.
The people here at Apple don’t just create products — they create the kind of wonder that’s revolutionized entire industries. It’s the diversity of those people and their ideas that inspires the innovation that runs through everything we do, from amazing technology to industry-leading environmental efforts. Join Apple, and help us leave the world better than we found it.
Minimum Qualifications
Technical BS/MS degree or equivalent experience
Excellent systems programming knowledge with C or C++
Strong experience with operating systems and/or knowledge of scheduling policies
Experience with, or deep understanding of, distributed systems and parallel computing architectures
Understanding of systems architecture/compilers/algorithms
Excellent written and oral communication skills
Preferred Qualifications
Experience with GPU programming (CUDA/ROCm/Metal) and high-performance computing, including successful optimization of large-scale parallel workloads
Experience with inter-node communication technologies (InfiniBand, RDMA, NCCL) in the context of ML training/inference