SRE/DevOps Engineer

We are looking for an SRE / DevOps Engineer to build and scale enterprise-grade cloud platforms. This is a balanced role (70% engineering, 30% operations) focused on:
  • building reliable and scalable cloud infrastructure,
  • driving automation and platform engineering,
  • improving observability and operational maturity,
  • enabling resilient production systems.

This role is ideal for engineers who are curious about how systems behave in production, enjoy debugging and automation, and want to grow into strong Site Reliability Engineers over time.

You will work closely with senior engineers and platform teams to improve reliability, scalability, deployment workflows, and production operations across enterprise-grade cloud platforms.

Key Responsibilities
  • Design and implement scalable AWS infrastructure for production systems
  • Build Infrastructure-as-Code modules for consistent and reproducible environments
  • Develop and maintain CI/CD pipelines for deployment, testing, and validation
  • Build automation for deployment workflows, system health verification, smoke testing, recovery validation, and operational efficiency
  • Contribute to monitoring, alerting, logging, and observability systems
  • Participate in production issue debugging, incident response, and system stability improvements
  • Collaborate across engineering teams to improve platform capabilities and operational maturity
  • Contribute to scalable and cost-aware cloud infrastructure design alongside senior engineers
  • Improve reliability and reduce operational toil through scripting, automation, and reusable tooling
  • Work on secure, resilient, and highly available cloud environments
  • Support modernization and improvement of existing systems without disrupting production stability

Requirements
  • 2–4 years of experience in DevOps / SRE / Cloud Engineering roles
  • Strong hands-on experience with:
      • AWS production environments
      • Infrastructure-as-Code (Terraform or CloudFormation)
      • CI/CD pipelines (Jenkins, GitHub Actions, or similar)
  • Strong scripting/programming skills in Python or Bash (required)
  • Proven experience debugging production issues, improving system stability, automating infrastructure or operational workflows, and working with cloud-native systems
  • Good understanding of distributed systems, cloud architecture, observability, scalability, and cost-aware infrastructure practices
  • Familiarity with containerized environments such as Docker, Kubernetes, or ECS
  • Strong problem-solving mindset with willingness to learn and take ownership
  • Good communication and collaboration skills

Tech Stack
  • AWS (RDS, Lambda, EventBridge, ECS/Kubernetes, CloudWatch, IAM, VPC)
  • Terraform / CloudFormation (IaC)
  • CI/CD: Jenkins, GitHub Actions
  • Observability: CloudWatch, Prometheus, Grafana
  • Scripting/Development: Python, Bash (Node.js a plus)
  • Chaos Engineering tools (AWS FIS, Gremlin, etc.) are good to have, not mandatory

Good to Have
  • Exposure to production incident handling or on-call support
  • Experience with Kubernetes or ECS
  • Exposure to monitoring, alerting, and observability tooling
  • Basic understanding of reliability engineering concepts
  • Exposure to database operations, backup/recovery, or disaster recovery concepts
  • Background in backend engineering before moving to DevOps/SRE
  • Curiosity toward automation, reliability engineering, and platform scalability

Benefits
  • Opportunity to work on large-scale cloud platforms and mission-critical systems
  • Work closely with experienced SRE and platform engineering teams
  • Exposure to advanced areas such as Digital Twin, AI/ML systems, and cloud-native architectures
  • Opportunity to grow into reliability engineering and platform ownership roles
  • A collaborative, engineering-focused team culture
  • Be part of a company passionate about solving real engineering problems through technology