Research Intern: Interpretability & Reliability (Summer 2027)

About CTGT & The Mission

Despite massive investment in commercial AI, organizations often find that demonstrated value is elusive, primarily due to the non-deterministic risk inherent to generative models. CTGT is the deterministic governance layer that enables the most important global institutions to deploy AI workflows with confidence.

Born out of Stanford University research, we provide the control plane that makes this possible: a lightweight, model-agnostic system that enforces policy, prevents drift, and produces auditable decisions in real time. When benchmarked on HaluEval, the CTGT Policy Engine (paired with gpt-oss-120b) outperformed frontier models (Gemini 3 Pro Preview, Claude Opus 4.5, and Claude Sonnet 4.5) at drastically lower compute cost.

We sit at the edge of AI research and bring frontier intelligence into real-world environments. We apply cutting-edge theory directly in production to make large language models more reliable, controllable, and performant in practice.

Our mission is to bring models to the level of performance and accountability required by the Fortune 500. By bridging the gap between LLM capabilities and domain-specific requirements, we unlock the true potential of generative AI to solve the most pressing problems in our world today.

The Role

Not your average fixed-point internship.

Frontier models are now usually right and occasionally confidently wrong, and they cannot tell you which is which. A model that is 95% reliable is useless in the settings we serve, the same way a self-driving car that avoids most accidents is useless. CTGT's research function exists to close that gap. Our founding research stems from feature learning in neural networks, and we use that machinery to extract and steer features at runtime, on open- and closed-weight models, without training a new artifact for every behavior.

As a research intern, you will own one hard problem inside this program from end to end. You will not be handed a labeling task or a notebook to babysit. You will take a real research question (for example, a better way to find what a model represents, to intervene on it, or to bound how wrong the system can be), design an approach, implement it against real models, and prove or disprove it with evidence that holds up. You will sit directly with the engineers building the Policy Engine, present in our weekly research review, and be expected to form opinions, ask hard questions, and take problems further than they were handed to you.

We hold interns to the standard of a calculation that has to be right, not a demo that usually works. In practice, this means meeting that standard despite limited ground truth, unverifiable intermediate steps, and failure modes that hide in the tails.

What You Will Do

Implement and stress-test methods for feature extraction and runtime intervention, from control vectors to activation probes, and make them work repeatably across model families
Design evaluations that bound error rather than average it: calibration under imbalanced data, reasoning-trace grading, behavior in verifiable and non-verifiable task regimes
Read the relevant literature, decide what actually matters, reproduce it, and push past it
Work with engineering to turn a finding into a Policy Engine capability that ships into audited, high-stakes environments
Present your progress every week and defend your reasoning

Who You Are

Pursuing a degree (Bachelor's through PhD) in computer science, mathematics, the sciences, or a similarly unforgiving quantitative field. We care about how you think, not your titles.
Strong mathematical foundations: linear algebra, probability, optimization, information theory
Can read a paper, decide what matters, and implement it
Have written real code for real computational systems; fluency with PyTorch and the modern ML stack, or the track record that says you will have it in weeks
Drawn to interpretability, model internals, and making systems provably reliable rather than usually fine
Self-directed, and able to make real progress without constant scaffolding

Our Stack

Languages: Python, Rust, and Node/TypeScript, with React on the frontend
Data: PostgreSQL, plus vector and graph databases
Infra: Docker, Kubernetes, Terraform, across several cloud providers and customer VPCs
ML: Self-hosted models on multiple GPU providers, plus frontier APIs

Logistics

Full-time, in person in San Francisco
10 to 12 weeks between May/June and August/September 2027
We sponsor US visas

What We Offer

World-Class Backing: You will join a venture-backed company with institutional investors including Google's Gradient Ventures, General Catalyst, and Y Combinator.

Real Impact: You will work directly on the core systems that determine how models perform in the wild. Your work ships into real, high-stakes environments where governance, auditability, and performance are non-negotiable.

Autonomy & Trust: We operate with a high degree of trust. You are expected to form strong technical opinions and execute on them.

To Apply

Apply through Work at a Startup with the single most interesting thing you have built or proven, such as a paper, a repo, or a calculation, and a sentence or two on why it mattered. We read every submission; this artifact is the signal.