Senior ML Infrastructure Engineer - Training Algorithms, SIML
In this role you will be technically hands-on, with deep subject-matter expertise in ML infrastructure.
Responsibilities Include:
- Training optimizations & profiling targeting vision/language pre-training
- Researching training recipes for effective scheduling of multimodal training workloads
- Experimentation & tooling for post-training ablations including reward modeling, distillation and prompt optimization
- Coordinating with post-training algorithm owners to analyze quality/performance tradeoffs of downstream capabilities
- Ablations involving optimization-aware fine-tuning
Minimum Qualifications
Bachelor's, Master's, or PhD in Electrical Engineering, Computer Science, or a related field (mathematics, physics, or computer engineering), with a focus on machine learning; or comparable professional experience
Experience training or adapting LLMs and diffusion models
Advanced fluency in PyTorch
Excellent programming skills and experience contributing software to large projects
Experience with distributed training of large models
Preferred Qualifications
Strong ML fundamentals
Experience working with large, diverse, cross-functional teams