Lead Engineer - AI

About HighLevel:

HighLevel is an AI-powered, all-in-one white-label sales & marketing platform that empowers agencies, entrepreneurs, and businesses to elevate their digital presence and drive growth. We are proud to support a global and growing community of over 1 million businesses, comprising agencies, consultants, and businesses of all sizes and industries. HighLevel gives users all the tools needed to capture, nurture, and convert new leads into repeat customers. As of mid-2025, HighLevel processes over 4 billion API hits and handles more than 2.5 billion message events every day. Our platform manages over 470 terabytes of data distributed across five databases, operates a network of over 250 microservices, and supports over 1 million hostnames.

Our People:

With over 1,500 team members across 15+ countries, we operate in a global, remote-first environment. We are building more than software; we are building a global community rooted in creativity, collaboration, and impact. We take pride in cultivating a culture where innovation thrives, ideas are celebrated, and people come first, no matter where they call home.

Our Impact:

As of mid-2025, our platform powers over 1.5 billion messages, helps generate over 200 million leads, and facilitates over 20 million conversations for the more than 1 million businesses we serve each month. Behind those numbers are real people growing their companies, connecting with customers, and making their mark - and we get to help make that happen.

About the Role:

As a Lead Engineer on the Users team, you will own day-to-day execution across some of the most business-critical and sensitive parts of the platform: permissions, tokens, and user-facing access flows.

This role is for someone who can move fast in an imperfect system, take on existing tech debt without fear, and deliver visible improvements across frontend and backend surfaces - without breaking security guarantees. You’ll work closely with Staff Engineers, Product Managers, and Designers to turn complex authorisation models into simple, reliable, and understandable user experiences.

Responsibilities:

End-to-end ownership of permissions and access control features:

-> Roles, permissions, scopes, and user assignments

-> Admin, agency, and account-level access flows

Execution on token-related workflows:

-> Token creation, rotation, revocation, and expiry flows

-> UI and backend consistency for API keys and access tokens

Tech-debt reduction in identity and access systems:

-> Untangling legacy permission logic

-> Removing implicit or duplicated authorisation paths

Full-stack delivery of security-sensitive user flows:

-> Permission visibility and debugging experiences

Improve developer ergonomics:

-> Cleaner APIs

-> Better abstractions

-> Safer defaults for other teams

Partner with Staff Engineers to operationalise architecture:

-> Turn RFCs into shippable code

-> Close gaps between design and reality

What You'll Do

Architecture & Platform Ownership

Own architecture and scaling decisions for core AI Studio platform components (e.g., App Generation Engine, React Build & Render Pipeline, Domain & Publishing Pipeline, AI Content Systems)

Lead design and implementation of cross-cutting initiatives to improve system responsiveness, generation accuracy, and platform robustness

Build scalable, fault-tolerant LLM pipelines for content generation, layout creation, experimentation, and AI-driven user guidance

AI & Distributed Systems Engineering

Work hands-on with technologies like Go, NestJS, Node.js, PostgreSQL, Firestore, vector databases, Cloudflare Workers, Cloudflare KV, and microservices

Design distributed systems that handle high-throughput generative and analytical workloads while ensuring correctness and low latency

Build and optimize embeddings pipelines, retrieval-augmented generation (RAG), and multi-agent orchestration frameworks

Quality, Observability & Reliability

Drive improvements in observability using Prometheus/Grafana, OpenTelemetry, and structured logging

Establish and maintain SLOs for generation latency, correctness, model safety, and system uptime

Strengthen resiliency and failover strategies to ensure seamless user experiences during traffic spikes and model load variations

AI Guardrails, Safety & Tooling

Implement hallucination mitigation strategies (code constraints, structured outputs, model verification layers, retrieval guards, test scaffolding)

Collaborate with infra and security to enforce data governance, access control, privacy boundaries, and compliance around user-generated data

Leverage LLMs and AI tools to write, test, and debug code, while driving standards that ensure reliability and consistency across engineering teams

Technical Leadership & Collaboration

Mentor engineers through code reviews, design discussions, and pairing, leading by influence to promote technical excellence, code quality, and a high-ownership culture (this is an individual contributor role with no direct reports)

Partner with PMs, designers, and other engineering leaders to define long-term roadmap and deliver high-performing AI-powered application-building features

Participate in reviews, deep dives, and on-call rotations to maintain a culture of accountability and operational excellence

What You'll Bring

6+ years of backend engineering experience, including distributed system design and high-scale platform development
Strong proficiency in Go (Golang) for building high-performance, concurrent backend services, alongside Node/NestJS experience
Hands-on experience building and operating edge/serverless services with Cloudflare Workers and Cloudflare KV (or comparable edge compute and distributed key-value stores)
Deep expertise in event-driven architectures, asynchronous workflows, and high-throughput data pipelines
Strong command of relational (PostgreSQL) and NoSQL data models, query optimization, and complex transactional data
Fluency in LLM integrations, vector search, embeddings, RAG patterns, and generative AI frameworks
Familiarity with frontend architecture using Vue, and a solid understanding of UI/UX principles

Operational Excellence

Experience with monitoring, alerting, and incident response for production systems
Strong instincts for scaling, latency optimization, and reliability under real-world load

AI & LLM Safety

Experience implementing structured generation, verification layers, retrieval-based grounding, and guardrails to reduce hallucinations
Understanding of prompt engineering patterns, context injection, and model evaluation

Soft Skills

Exceptional communication and cross-functional leadership capabilities

EEO Statement:

The company is an Equal Opportunity Employer. As an employer subject to affirmative action regulations, we invite you to voluntarily provide the following demographic information. This information is used solely for compliance with government recordkeeping, reporting, and other legal requirements. Providing this information is voluntary, and declining to provide it will not affect your application status. This data will be kept separate from your application and will not be used in the hiring decision.
