Machine Learning Data Analyst

You will design, build, and maintain the automated data pipelines that support ML training and evaluation, covering collection, labeling, validation, and metric computation. You will set and monitor data and labeling quality standards, run accuracy audits and root-cause analysis, and automate model evaluation metrics and reporting. The role also covers building scalable systems for performance tracking, dashboards, and monitoring; operating workflow orchestration (Airflow, Prefect, or similar); writing clean Python and performant SQL for large datasets (including Redshift and related AWS tooling); and working with ML engineers, analysts, and product stakeholders to prioritize work and unblock execution.

Responsibilities

  • Design, build, and maintain automated data pipelines for collection, labeling, validation, and metric computation that support ML training and evaluation
  • Establish and monitor data and labeling quality standards and drive consistency checks, accuracy audits, and root-cause analysis
  • Define, implement, and automate model evaluation metrics and reporting that reflect real-world product use cases and business goals
  • Build scalable systems for performance tracking, dashboards, and monitoring to enable fast, data-driven decisions
  • Develop and operate reliable workflow orchestration (Airflow, Prefect, or similar) to schedule, observe, and troubleshoot end-to-end pipelines
  • Write clean, maintainable Python code and performant SQL to process large datasets, using Amazon Redshift and related AWS tooling
  • Partner closely with ML engineers, analysts, and product stakeholders to prioritize work, unblock execution, and improve internal tooling for analysis and evaluation

Requirements

  • 3+ years of experience as a Data Analyst or in a comparable data or data infrastructure role
  • Strong Python programming skills with a focus on clean, maintainable code
  • Solid SQL expertise and experience with cloud or columnar databases such as Amazon Redshift
  • Hands-on experience with workflow orchestration tools such as Airflow, Prefect, or Dagster
  • Proven experience in data quality management, data preparation, or ML data pipelines
  • Understanding of metric computation, data labeling, and automation in ML workflows
  • Strong collaboration and problem-solving skills
  • Background in mathematics, physics, or engineering

Benefits

  • Flexible working hours and work location
  • Open vacation policy