AI Agent Architect

Emergent builds autonomous coding agents that replace traditional software development by generating, testing, and deploying production applications directly from plain-language intent. Our systems run in production at global scale and are used to build millions of real applications.

Since our public launch, we've crossed $100M in ARR and grown to over 10M users across 190+ countries. We're backed by Khosla Ventures, SoftBank, Google, Lightspeed, Prosus, Together, and Y Combinator.

We're solving the hard part of AI-driven software creation: correctness, reliability, security, and scale in real production systems. The team is made up of repeat founders, Olympiad medalists, IIT & IIM alumni, and leaders from Google, Amazon, and Dropbox.

We're hiring builders who want ownership, speed, and impact at global scale.

The Role:
We're building AI agents that plan, build, test, and ship real software at global scale. Your job is to turn raw LLM and system capability into measurable, shippable gains in how well our agents actually perform. You own the loop end-to-end: what we try, how we prove it's better, what goes live, and what gets rolled back. This role sits at the intersection of product, engineering, and applied research. You won't just ship features. You'll decide what makes the agent better, how we measure it, and what's ready to go live at the scale of millions of real applications.

What You'll Do:

Develop deep, evidence-grounded intuition for how the agent thinks, succeeds, and fails across the full range of real-world use
Mine production behavior for the failure modes, regressions, and bottlenecks most teams never see, and turn them into clear, quantitative signal
Define and run high-leverage experiments that improve agent quality, reliability, and code outcomes, spanning prompt tuning, eval dataset creation, and harness engineering
Build and evolve evaluation frameworks that measure agent quality at scale: define the metric, build the dataset, validate it against known signals, and ship dashboards that make regressions impossible to miss
Ship with rigor through clear metrics, evaluation gates, staged rollouts, and explicit rollback criteria
Drive the hard problems in context engineering, memory systems, tool use, and long-horizon execution, where agent reliability is actually won or lost
Make hard calls in subjective, probabilistic systems: when a regression is real, when a win is noise, when a benchmark is overfit, when to ship on mixed signals, and when to kill a promising direction
Think like the agent and continuously make it smarter, more reliable, and more useful

Who You Are:

5+ years building and shipping software, with real end-to-end ownership
At least 1 year of hands-on experience with agentic systems and their evaluation
Strong systems thinking paired with sharp product judgment
Hands-on with experimentation, metrics, and debugging; fluent in Python, SQL, or similar for analysis
Comfortable reasoning about noise, confounds, distribution shift, and selection effects; you know when to trust a number and when to suspect it
Energized by the long tail: sifting through large volumes of agent behavior to find the rare, hidden failure mode is the fun part for you
Deep, current interest in LLMs, agents, and emerging model capabilities; you know what shipped last week and why it matters
An independent operator with leadership presence: you scope your own work and push back on weak ideas, including your manager's
Bias toward velocity without compromising honest measurement; you know which corners are safe to cut and which are load-bearing

Ideal For:

Principal or Staff ICs, software architects, and lead builders
Engineers who moved into product and never stopped building
Technical cofounders who have owned systems end-to-end and want to stay close to the work

Benefits and Perks:

Daily Meals: Lunch and dinner provided
Family Insurance: 3 lakhs' worth of coverage for you and your family
Unlimited Paid Time Off: Take the time you need to recharge and come back refreshed
Flexible Working Hours: Work arrangements that fit your life and commitments

Let's build the future of software together.