ML Researcher, Apple Foundation Models
We are building the next generation of models optimized for agentic, reasoning, and coding capabilities. This means training models via RL to reason from first principles, building autonomous coding agents that operate in real repositories, and developing agentic systems that handle multi-step workflows with error recovery. You will work on problems such as RL with verifiable rewards for mathematical reasoning, multi-turn RL for coding agents evaluated on SWE-Bench and beyond, scaling laws for RL compute allocation, progressive alignment across capability stages, and training models to manage their own context in long-horizon tasks. This is applied research with direct product impact: your work will ship to millions of users.
Minimum Qualifications
Demonstrated expertise in deep learning with publications at top ML or NLP conferences, or a track record of applying deep learning techniques to products
Proficiency in Python and in a deep learning toolkit such as JAX, PyTorch, or TensorFlow
Ability to work in a collaborative environment
PhD, or equivalent practical experience, in Computer Science or a related technical field
Preferred Qualifications
Reinforcement learning for LLMs: RLHF, GRPO, PPO, RLVR, reward modeling, RL scaling laws
Code generation and coding agents: repository-level code understanding, agentic coding
Agentic systems: multi-turn RL, tool-use planning, long-horizon task execution, user simulation
Distillation and alignment: on-policy distillation, reward-tilted distillation, cross-stage distillation to combine independently optimized capabilities into a single model
Long context and efficiency: sparse attention, context compression, scaling to very long context windows