Senior Manager of Site Reliability Engineering

Guide and shape the future of technology at a globally recognized firm, driven by pride in ownership.


As a Senior Manager of Site Reliability Engineering at JPMorgan Chase within the Infrastructure Platforms-Data Protection and Recovery team, you are the non-functional requirement owner and champion for the applications in your remit. You are a key influencer in your team’s strategic planning, driving continual improvement in customer experience, resiliency, security, scalability, monitoring, instrumentation, and automation of the software in your area. You act in a blameless, data-driven manner and navigate difficult situations with composure and tact.

Job responsibilities

  • Demonstrates expertise in site reliability principles and an understanding of the fine balance between features, efficiency, and stability
  • Effectively negotiates with peers and executive partners to ensure optimal outcomes for all

  • Drives reuse-first adoption of enterprise-authorized AI capabilities within the work environment to improve reliability operations and customer experience outcomes, with human-in-the-loop validation and appropriate handling of sensitive data.

  • Drives the adoption of site reliability practices throughout the organization
  • Ensures your teams follow site reliability best practices and can demonstrate this empirically through stability and reliability metrics
  • Drives a culture of continual improvement and solicits real-time feedback to improve the customer’s experience
  • Ensures your team collaborates with other teams within your group’s specialization and avoids duplication of work where possible

  • Follows blameless, data-driven, post-mortem strategies and conducts regular team debriefs to enable learning from both successes and mistakes
  • Provides personalized coaching for entry to mid-level team members
  • Ensures your team documents and shares their knowledge and innovations via internal forums, communities of practice, guilds, and conferences

  • Establishes team standards for AI-assisted reliability workflows across automation and delivery practices, ensuring traceability/auditability, resiliency, and security controls.

Required qualifications, capabilities, and skills

  • Formal training or certification on software engineering concepts and 5+ years applied experience
  • Advanced proficiency in site reliability culture and principles, and the ability to demonstrate how to implement site reliability across application and platform teams while avoiding common pitfalls

  • Experience leading teams in the safe use of enterprise-authorized AI capabilities within the work environment for reliability engineering workflows, including validation habits and awareness of data sensitivity.

  • Ability to set and reinforce organization-level practices for reviewing AI-assisted recommendations and escalating uncertain decisions while maintaining resiliency, security, and auditability outcomes.

  • Experience leading technologists to manage and solve complex technological issues at a firmwide level
  • Ability to influence the team’s culture by championing innovation and change for success
  • Experience hiring, developing, and recognizing talent
  • Proficiency in at least one programming language (e.g., Python, Java Spring Boot, .NET)
  • Demonstrated proficiency in software applications and technical processes within a technical discipline (e.g., cloud, artificial intelligence, machine learning, mobile)
  • Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform)
  • Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker) and with troubleshooting common compute, storage, and networking technologies and hardware issues

Preferred qualifications, capabilities, and skills

  • Ability to code and demonstrate data fluency
  • Experience with enterprise data protection products such as Cohesity or Commvault
