AIML - Sr Machine Learning Engineer, Responsible AI
This role focuses on developing, carrying out, interpreting, and communicating pre- and post-ship evaluations of the safety of Apple Intelligence features. These evaluations draw on both human grading and model-based auto-grading. Additionally, this role researches and develops auto-grading methodology and infrastructure to benefit ongoing and future Apple Intelligence safety evaluations.
Producing safety evaluations that uphold Apple’s Responsible AI values requires thoughtful data sampling, creation, and curation for evaluation datasets; high-quality, detailed annotations and careful auto-grading to assess feature performance; and mindful analysis to understand what the evaluation means for the user experience.
This role draws heavily on applied data science, scientific investigation and interpretation, cross-functional communication and collaboration, and metrics reporting and presentation.
Minimum Qualifications
MS or PhD in Computer Science, Machine Learning, Statistics, or a related field; or an equivalent qualification acquired through other avenues.
Experience working with generative models for evaluation and/or product development, and up-to-date knowledge of their common challenges and failure modes.
Strong engineering skills and experience in writing production-quality code in Python.
Deep experience in foundation model-based AI programming (e.g., using DSPy to optimize foundation model prompts) and a drive to innovate in this space.
Experience working with noisy, crowd-based data labels and human evaluations.
Preferred Qualifications
Experience working in the Responsible AI space.
Prior scientific research and publication experience.
Strong organizational and operational skills working with large, multi-functional, and diverse teams.
Curiosity about fairness and bias in generative AI systems, and a strong desire to help make the technology more equitable.