Senior ML Data Platform Developer
We are seeking a visionary and highly technical Senior ML Data Platform Developer to architect, implement, scale, and maintain the data engine powering our next-generation frontier models.
In this high-impact role, you will bridge the gap between cutting-edge AI research and high-performance engineering, treating the data platform as an internal product with our researchers as your primary customers. You will be responsible for designing a multi-tiered, ultra-low-latency storage architecture and building automated, petabyte-scale data processing pipelines. Our technical environment is not fixed and will evolve as our projects scale. We expect someone who can challenge and evolve it rather than simply follow industry trends, and who can make sustainable decisions in close collaboration with our Research and Product teams.
Key Responsibilities
- Design and maintain a layered storage architecture and partner with the Research team to ensure seamless integration with the training pipelines.
- Scale and automate the data processing stack to handle petabytes of data and ensure its smooth operation.
- Ensure efficient use of compute resources, including GPU access for compute-intensive data processing tasks.
- Assist the Infrastructure team in provisioning the compute and storage environments to support scaling.
- Ensure all datasets, including the intermediate outputs of each transformation stage, are versioned, reproducible, and fully traceable to meet specific and dynamic experiment needs, and are accompanied by datasheets, in accordance with internal Data Governance policies.
- Collaborate with the Research team and other teams to understand their self-service needs around dataset exploration, sampling, and analysis, and develop tooling that meets those needs.
Skills and Qualifications
- A bachelor’s degree in a relevant field (e.g., computer science, computer engineering, software engineering) is required.
- 5+ years of experience designing, implementing, and managing web-scale storage or high-performance computing (HPC) networking, or working within large-scale distributed ML data frameworks, with recent experience using tools such as Lustre, Ray, Apache Spark, workflow orchestrators, Apache Arrow, and/or Parquet.
- Ability to collaborate effectively with cross-functional teams, document best practices, and stay up to date with the latest advancements in large-scale data processing and software development.
- Experience with workload managers (e.g., Ray, Kubernetes, SLURM).
- Familiarity with containerization tools (e.g., Docker, Enroot).
- Familiarity with data infrastructures and platforms (e.g., vector databases).
What we offer
- The chance to contribute meaningfully to a globally critical initiative
- Comprehensive health benefits (including a mental health and wellness management account)
- 20 days of vacation per year upon start
- Employer contribution of 4% to your retirement savings, with no required employee match
- Additional compensation totaling 8% of your salary to apply towards additional retirement savings or bonuses (independent of group and individual performance)
- A team of passionate, world-class experts in their fields
- A collaborative and inclusive work environment in our vibrant office space in the heart of Little Italy, in the trendy Mile-Ex district, close to public transportation
About LawZero
LawZero is a non-profit organization committed to advancing research and creating technical solutions that enable safe-by-design AI systems. Its scientific direction is based on new research and methods proposed by Professor Yoshua Bengio, the most cited AI researcher in the world. Based in Montreal, LawZero’s research aims to build non-agentic AI that learns primarily to understand the world rather than to act in it, giving truthful answers to questions based on transparent and externalized probabilistic reasoning. Such AI systems could be used to accelerate scientific discovery, to provide oversight for agentic AI systems, and to advance the understanding of AI risks and how to avoid them. LawZero believes that AI should be cultivated as a global public good, developed and used safely towards human flourishing. For more information, visit www.lawzero.org.
You belong here
At LawZero, diversity is important to us. We value a work environment that is fair, open and respectful of differences. We welcome applications from highly qualified individuals interested in working towards our mission in a respectful, inclusive and collaborative setting.
Your personal information will be collected and processed by LawZero to evaluate your application for employment in compliance with our Privacy Policy. Under the privacy laws in force in your country of residence, you may have several privacy rights, such as the right to request access to your personal information or to request that it be rectified or erased. Details on how you can exercise your rights can be found in our Privacy Policy.