Data Scientist
As a Data Scientist focused on Algorithm Evaluation, you will serve as a technical leader responsible for driving end-to-end evaluation strategy for complex algorithmic systems. You will develop rigorous methodologies to assess algorithm quality, identify failure patterns, and quantify system behavior across large-scale datasets and real-world scenarios.
You will lead deep dives into algorithm performance, uncover insights through advanced statistical analysis, and establish scalable frameworks to improve evaluation efficiency and confidence in product decisions. You will also help shape how agentic solutions and AI-assisted tooling are integrated into day-to-day workflows to accelerate data analysis, failure investigation, annotation quality improvement, root-cause discovery, and evaluation automation.
This role requires strong technical depth, exceptional analytical rigor, and the ability to influence cross-functional teams in highly ambiguous environments.
Minimum Qualifications
BS and a minimum of 10 years of relevant industry experience.
7+ years of experience in data science, machine learning evaluation, algorithm analysis, or related technical disciplines.
Demonstrated experience driving technical initiatives in ambiguous, cross-functional environments.
Strong expertise in statistical analysis, experimentation methodologies, and large-scale data analytics.
Deep experience evaluating machine learning, computer vision, or AI systems through quantitative metrics and performance analysis.
Strong programming experience in Python, with hands-on experience building scalable analytics and automation pipelines.
Experience conducting algorithm deep dives, failure analysis, and model performance investigations.
Familiarity with AI-assisted analysis workflows, foundation models, agentic systems, or intelligent automation approaches for technical problem solving.
Strong understanding of algorithm evaluation concepts, including precision/recall tradeoffs, confusion analysis, robustness measurement, regression detection, and benchmarking methodologies.
Exceptional problem-solving skills, with the ability to translate ambiguous technical problems into measurable frameworks.
Preferred Qualifications
Experience evaluating machine learning, computer vision, multimodal, or foundation model systems in production environments.
Experience designing or deploying agentic workflows to improve engineering productivity, data analysis, evaluation efficiency, or annotation quality.
Familiarity with LLM-based systems, retrieval pipelines, structured reasoning, or AI-assisted analytics frameworks.
Experience defining quality frameworks and evaluation methodologies for large-scale intelligent systems.
Experience building automated benchmarking systems and large-scale performance monitoring infrastructure.
Knowledge of A/B experimentation, causal inference, and advanced statistical modeling.
Strong understanding of the ML lifecycle, model validation, and continuous evaluation methodologies.
Excellent communication skills, with a proven ability to influence technical decisions through data-driven insights.