Machine Learning Engineer - Speech & Multimodal Language Modeling
The Special Projects team at Apple is developing novel user-facing features that leverage the
multimodal capabilities of state-of-the-art foundation language models. We are looking for a
highly skilled Machine Learning Engineer to build and evaluate these experiences, with a
specific focus on Multimodal and Speech Language Models. A successful candidate has
experience evaluating complex foundation model-driven systems end-to-end and translating
subjective product requirements into objective criteria, has strong statistical analysis skills, and
has worked with Speech Language Models.
Minimum Qualifications
Master’s degree in Computer Science or Machine Learning
2+ years of hands-on experience building and evaluating generative AI models
Proficiency in Python and ML frameworks (PyTorch or TensorFlow)
Preferred Qualifications
PhD in Computer Science, Machine Learning, Statistics, or another STEM field
5+ years of hands-on experience with SpeechLMs or LLMs
Experience with large-scale audio data processing on distributed systems
Experience with prompt evaluation and optimization for generative AI models
Proficiency in training, fine-tuning, and evaluation of foundation models and frameworks
A track record of publications or technical presentations in Machine Learning journals or
conferences
Excellent communication and cross-functional collaboration skills