Senior ML Infrastructure Engineer - Training Algorithms, SIML

In this role you will be technically hands-on, with deep subject matter expertise in ML infrastructure.

Responsibilities Include:
- Training optimizations & profiling targeting vision/language pre-training
- Researching training recipes for effective scheduling of multimodal training workloads
- Experimentation & tooling for post-training ablations, including reward modeling, distillation, and prompt optimization
- Coordinating with post-training algorithm owners to analyze quality / performance tradeoffs of downstream capabilities
- Ablations involving optimization-aware fine-tuning

Minimum Qualifications
- Bachelor's, Master's, or PhD in Electrical Engineering/Computer Science or a related field (mathematics, physics, or computer engineering), with a focus on machine learning; or comparable professional experience
- Experience training / adapting LLM and diffusion models
- Advanced fluency in PyTorch
- Excellent programming skills and experience contributing software to large projects
- Experience with distributed training of large models

Preferred Qualifications
- Strong ML fundamentals
- Experience working with large, cross-functional, and diverse teams
