Staff/Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge Platforms

Work alongside the Foundation Model Research team to optimize inference for cutting-edge model architectures. Work closely with product teams to build production-grade solutions to launch models serving millions of customers in real time. Build tools to understand inference bottlenecks across different hardware and use cases. Mentor and guide engineers in the organization.

Minimum Qualifications

5+ years of experience leading and driving complex, ambiguous projects.
Experience with the LLM inference stack.
Familiarity with GPU programming concepts using CUDA.
Familiarity with one of the popular ML frameworks, such as PyTorch or TensorFlow.
Experience with high-throughput services, particularly at supercomputing scale.
Proficiency in running applications in the cloud (AWS, Azure, or equivalent) using Kubernetes, Docker, etc.
BS in Computer Science, Artificial Intelligence, Machine Learning, Information Retrieval, Data Science, or a related field.

Preferred Qualifications

Proficiency in building and maintaining systems written in modern languages (e.g., Golang, Python).
Familiarity with fundamental deep learning architectures such as Transformers and encoder/decoder models.
Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Server, etc.
Experience writing custom kernels using CUDA or OpenAI Triton.
MS in Computer Science, Artificial Intelligence, Machine Learning, Information Retrieval, Data Science, or a related field.

