Machine Learning Engineer

We are seeking a skilled and forward-looking ML Engineer with experience in Large Language Models (LLMs), generative AI, and agentic architectures to join our growing R&D and Applied AI team. This role is critical in helping Oversight deliver the next generation of agentic AI systems for enterprise spend management and risk controls.
The ideal candidate has a strong foundation in machine learning, modern deep learning frameworks, and data pipelines, coupled with hands-on experience experimenting with LLMs, small language models (SLMs), multi-agent frameworks, and retrieval-augmented generation (RAG).

You will work closely with AI/ML researchers, data engineers, and product teams to design, implement, and optimize models that power autonomous exception resolution, anomaly detection, and explainable insights. This is a hands-on engineering role where you will not only build and scale ML systems but also actively contribute to cutting-edge applied research in agentic AI.

Core ML/LLM Engineering

  • Contribute to the design, training, fine-tuning, and deployment of ML models and LLMs for production.
  • Implement RAG pipelines using vector databases.
  • Work with frameworks and protocols such as LangChain, LangGraph, and MCP to prototype and optimize multi-agent workflows.
  • Develop prompt engineering, optimization, and safety techniques for agentic LLM interactions.
  • Integrate memory, evidence packs, and explainability modules into agentic pipelines.
  • Work hands-on with multiple LLM ecosystems:
    • OpenAI GPT models (GPT-4, GPT-4o, fine-tuned GPTs).
    • Anthropic Claude (Claude 2/3 for reasoning and safety-aligned workflows).
    • Google Gemini (multimodal reasoning, advanced RAG integration).
    • Meta LLaMA (fine-tuned/custom models for domain-specific tasks).

Data & Infrastructure

  • Collaborate with Data Engineering to build and maintain real-time and batch data pipelines that serve ML/LLM workloads.
  • Conduct feature engineering, preprocessing, and embedding generation for structured and unstructured data.
  • Implement model monitoring, drift detection, and retraining pipelines.
  • Leverage cloud ML platforms (AWS SageMaker, Databricks ML) for experimentation and scaling.

Research & Applied Innovation

  • Explore and evaluate emerging LLM/SLM architectures and agent orchestration patterns.
  • Experiment with generative AI and multimodal models to extend capabilities beyond text (images, structured financial data).
  • Collaborate with R&D to prototype autonomous resolution agents, anomaly detection models, and reasoning engines.
  • Translate research prototypes into production-ready components.

Collaboration & Delivery

  • Work cross-functionally with R&D, Data Science, Product, and Engineering to deliver business-aligned AI features.
  • Participate in design reviews, architecture discussions, and model evaluations.
  • Document processes, experiments, and results effectively for knowledge sharing.
  • Mentor junior engineers and contribute to ML engineering best practices.

Required Qualifications

  • Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field.
  • 3+ years of experience building and deploying ML systems.
  • Proficiency in Python and libraries such as PyTorch, TensorFlow, scikit-learn, and Hugging Face Transformers.
  • Hands-on experience with LLMs/SLMs (fine-tuning, prompt design, inference optimization).
  • Demonstrated experience with at least two of the following ecosystems:
    1. OpenAI GPT models (chat, assistants, fine-tuning).
    2. Anthropic Claude (safety-first AI for reasoning and summarization).
    3. Google Gemini (multimodal reasoning, enterprise-scale APIs).
    4. Meta LLaMA (open-source, fine-tuned models).
  • Familiarity with vector databases, embeddings, and RAG pipelines.
  • Ability to work with structured and unstructured data at scale.
  • Knowledge of SQL and distributed data frameworks (Spark, Ray).
  • Strong understanding of the ML lifecycle: data prep, training, evaluation, deployment, monitoring.

Preferred Qualifications

  • Experience with agentic frameworks (LangChain, LangGraph, MCP, AutoGen).
  • Knowledge of AI safety, guardrails, and explainability techniques.
  • Hands-on experience deploying ML/LLM solutions in cloud environments (AWS, GCP, Azure).
  • Experience with CI/CD for ML (MLOps), monitoring, and observability.
  • Familiarity with anomaly detection, fraud/risk modeling, or behavioral analytics.
  • Contributions to open-source AI/ML projects or publications in applied ML research.

US Army:
17D – Cyber Capability Developer
17C – Cyber Operations Specialist (Advanced Track)
35Q – Cryptologic Network Warfare Specialist
35N / 35P / 35S (Intel Analysts w/ coding exposure)

US Air Force:
17X – Cyberspace Warfare Operations
1B4X1 – Cyber Warfare Operations
9S100 – Scientific Applications Specialist
3D0X4 / 1D7X1 (Software / Data Ops variants)

US Navy:
CTN – Cryptologic Technician (Networks)
CTI / CTR (with analytics focus)
Information Warfare Officers (1810)

US Marine Corps:
1721 – Cyberspace Warfare Operator
26XX Intel (with data/automation focus)

US Space Force:
Cyber Operations (DCO/OCO) Guardians