Machine Learning Engineer

We are seeking a skilled and forward-looking ML Engineer with experience in Large Language Models (LLMs), generative AI, and agentic architectures to join our growing R&D and Applied AI team. This role is critical in helping Oversight deliver the next generation of agentic AI systems for enterprise spend management and risk controls.

The ideal candidate has a strong foundation in machine learning, modern deep learning frameworks, and data pipelines, coupled with hands-on experience experimenting with LLMs, small language models (SLMs), multi-agent frameworks, and retrieval-augmented generation (RAG).

You will work closely with AI/ML researchers, data engineers, and product teams to design, implement, and optimize models that power autonomous exception resolution, anomaly detection, and explainable insights. This is a hands-on engineering role where you will not only build and scale ML systems but also actively contribute to cutting-edge applied research in agentic AI.

Core ML/LLM Engineering

Contribute to the design, training, fine-tuning, and deployment of ML models and LLMs for production.
Implement RAG pipelines using vector databases.
Work with frameworks and protocols such as LangChain, LangGraph, and MCP to prototype and optimize multi-agent workflows.
Develop prompt engineering, optimization, and safety techniques for agentic LLM interactions.
Integrate memory, evidence packs, and explainability modules into agentic pipelines.
Work hands-on with multiple LLM ecosystems:
- OpenAI GPT models (GPT-4, GPT-4o, fine-tuned GPTs).
- Anthropic Claude (Claude 2/3 for reasoning and safety-aligned workflows).
- Google Gemini (multimodal reasoning, advanced RAG integration).
- Meta LLaMA (fine-tuned/custom models for domain-specific tasks).

Data & Infrastructure

Collaborate with Data Engineering to build and maintain real-time and batch data pipelines that serve ML/LLM workloads.
Conduct feature engineering, preprocessing, and embedding generation for structured and unstructured data.
Implement model monitoring, drift detection, and retraining pipelines.
Leverage cloud ML platforms (Amazon SageMaker, Databricks ML) for experimentation and scaling.

Research & Applied Innovation

Explore and evaluate emerging LLM/SLM architectures and agent orchestration patterns.
Experiment with generative AI and multimodal models to extend capabilities beyond text (images, structured financial data).
Collaborate with R&D to prototype autonomous resolution agents, anomaly detection models, and reasoning engines.
Translate research prototypes into production-ready components.

Collaboration & Delivery

Work cross-functionally with R&D, Data Science, Product, and Engineering to deliver business-aligned AI features.
Participate in design reviews, architecture discussions, and model evaluations.
Document processes, experiments, and results effectively for knowledge sharing.
Mentor junior engineers and contribute to ML engineering best practices.

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field.
3+ years of experience building and deploying ML systems.
Proficiency in Python and libraries such as PyTorch, TensorFlow, scikit-learn, and Hugging Face Transformers.
Hands-on experience with LLMs/SLMs (fine-tuning, prompt design, inference optimization).
Demonstrated experience with at least two of the following ecosystems:
1. OpenAI GPT models (chat, assistants, fine-tuning).
2. Anthropic Claude (safety-first AI for reasoning and summarization).
3. Google Gemini (multimodal reasoning, enterprise-scale APIs).
4. Meta LLaMA (open-source, fine-tuned models).
Familiarity with vector databases, embeddings, and RAG pipelines.
Ability to work with structured and unstructured data at scale.
Knowledge of SQL and distributed data frameworks (Spark, Ray).
Strong understanding of the ML lifecycle: data prep, training, evaluation, deployment, monitoring.

Preferred Qualifications

Experience with agentic frameworks (LangChain, LangGraph, MCP, AutoGen).
Knowledge of AI safety, guardrails, and explainability techniques.
Hands-on experience deploying ML/LLM solutions in cloud environments (AWS, GCP, Azure).
Experience with CI/CD for ML (MLOps), monitoring, and observability.
Familiarity with anomaly detection, fraud/risk modeling, or behavioral analytics.
Contributions to open-source AI/ML projects or publications in applied ML research.

US Army:
17D – Cyber Capability Developer
17C – Cyber Operations Specialist (Advanced Track)
35Q – Cryptologic Network Warfare Specialist
35N / 35P / 35S (Intel Analysts w/ coding exposure)

US Air Force:
17X – Cyberspace Warfare Operations
1B4X1 – Cyber Warfare Operations
9S100 – Scientific Applications Specialist
3D0X4 / 1D7X1 (Software / Data Ops variants)

US Navy:
CTN – Cryptologic Technician (Networks)
CTI / CTR (with analytics focus)
Information Warfare Officers (1810)

US Marine Corps:
1721 – Cyberspace Warfare Operator
26XX Intel (with data/automation focus)

US Space Force:
Cyber Operations (DCO/OCO) Guardians