Senior Applied ML Researcher - Video Apps

Responsibilities

- Design and train deep neural networks for video, image, audio, and audio-visual tasks.
- Build models for audio-visual representation learning, cross-modal alignment, and fusion.
- Develop solutions for tasks such as:
  - Video understanding and temporal modeling
  - Audio-visual event detection
  - Speech, sound, and scene understanding
  - Multimodal classification, detection, and localization

Minimum Qualifications

- MS in Computer Science, Machine Learning, or a related field, or equivalent practical experience
- 4+ years of experience in deep learning or machine learning engineering
- Strong expertise in deep neural networks and modern training workflows
- 8 years +
- Hands-on experience with computer vision and/or audio modeling
- Proficiency in Python and deep learning frameworks (PyTorch preferred)
- Solid understanding of linear algebra, probability, and optimization
- Ability to build intuition from a problem statement and translate it into dataset requirements, neural network design, and loss functions

Preferred Qualifications

- PhD in Computer Science, Machine Learning, or a related field, or equivalent practical experience
- Publications in top-tier ML conferences (e.g., NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV)
- Experience with self-supervised or foundation model pre-training
- Open-source contributions in vision, audio, or multimodal AI
- Bonus: Experience with Objective-C and/or Swift for on-device deployment
