Site Reliability Engineer, Physical Infrastructure
The Systems and Infrastructure team builds and manages the world-class services and physical infrastructure that Apple software engineers worldwide use to build, test, and release Apple's software.
About Our Team: We are a team dedicated to engineering excellence, reusable design, and simplicity. We foster a supportive, growth-focused culture where we mentor each other and work together to build resilient, high-quality systems.
Minimum Qualifications
3+ years of experience as a Site Reliability Engineer, DevOps Engineer, or Systems Admin focused on physical infrastructure in a large-scale distributed environment
Strong software development skills in a language like Swift, Go, or Python, and a high degree of comfort with shell scripting (Bash)
Hands-on experience building and managing systems with containers and container orchestration tools (Docker, Kubernetes)
Deep understanding of networking (TCP/IP, DNS, HTTP) and experience using observability tools (monitoring, logging, tracing) to diagnose complex issues
Excellent problem-solving and communication skills, with a strong sense of ownership and drive
BS/MS in Computer Science, Engineering, or a related field
Preferred Qualifications
Experience building automation tools that eliminate routine tasks, treating every manual process as an opportunity to code a solution
Experience with Unix/Linux systems administration and command-line diagnostic tools
Proven experience leading initiatives to reduce technical debt, refactor systems, or improve performance and latency
Expertise in performance analysis and capacity planning for physical infrastructure
Demonstrated ability to lead incident response for high-impact outages
Familiarity with using Generative AI (GenAI) or Large Language Models (LLMs) to accelerate operational tasks, such as automating runbooks, generating scripts, or analyzing incident data