SRE/DevOps Engineer

We are looking for an SRE / DevOps Engineer to build and scale enterprise-grade cloud platforms. This is a balanced role (70% engineering, 30% operations) focused on:

building reliable and scalable cloud infrastructure,
driving automation and platform engineering,
improving observability and operational maturity,
enabling resilient production systems.

This role is ideal for engineers who are curious about how systems behave in production, enjoy debugging and automation, and want to grow into strong Site Reliability Engineers over time.

Work closely with senior engineers and platform teams to improve reliability, scalability, deployment workflows, and production operations across enterprise-grade cloud platforms.

Key Responsibilities
Design and implement scalable AWS infrastructure for production systems
Build Infrastructure-as-Code modules for consistent and reproducible environments
Develop and maintain CI/CD pipelines for deployment, testing, and validation
Build automation for deployment workflows, system health verification, smoke testing, recovery validation, and operational efficiency
Contribute to monitoring, alerting, logging, and observability systems
Participate in production issue debugging, incident response, and system stability improvements
Collaborate across engineering teams to improve platform capabilities and operational maturity
Contribute to scalable and cost-aware cloud infrastructure design alongside senior engineers
Improve reliability and reduce operational toil through scripting, automation, and reusable tooling
Work on secure, resilient, and highly available cloud environments
Support modernization and improvement of existing systems without disrupting production stability

Requirements
2–4 years of experience in DevOps / SRE / Cloud Engineering roles
Strong hands-on experience with:
AWS production environments
Infrastructure-as-Code (Terraform or CloudFormation)
CI/CD pipelines (Jenkins, GitHub Actions, or similar)
Strong scripting/programming skills in Python or Bash (required)
Proven experience with debugging production issues, improving system stability, automating infrastructure or operational workflows, and working with cloud-native systems
Good understanding of distributed systems, cloud architecture, observability, scalability, and cost-aware infrastructure practices
Familiarity with containerized environments such as Docker, Kubernetes, or ECS
Strong problem-solving mindset with willingness to learn and take ownership
Good communication and collaboration skills

Tech Stack
AWS (RDS, Lambda, EventBridge, ECS/Kubernetes, CloudWatch, IAM, VPC)
Terraform / CloudFormation (IaC)
CI/CD: Jenkins, GitHub Actions
Observability: CloudWatch, Prometheus, Grafana
Scripting/Development: Python, Bash (Node.js a plus)
Chaos engineering tools (AWS FIS, Gremlin, etc.): good to have, not mandatory

Good to Have
Exposure to production incident handling or on-call support
Experience with Kubernetes or ECS
Exposure to monitoring, alerting, and observability tooling
Basic understanding of reliability engineering concepts
Exposure to database operations, backup/recovery, or disaster recovery concepts
Background in backend engineering before moving to DevOps/SRE
Curiosity toward automation, reliability engineering, and platform scalability

Benefits
Opportunity to work on large-scale cloud platforms and mission-critical systems
Work closely with experienced SRE and platform engineering teams
Exposure to advanced areas such as Digital Twin, AI/ML systems, and cloud-native architectures
Opportunity to grow into reliability engineering and platform ownership roles
Work with a collaborative and engineering-focused team culture
Be part of a company passionate about solving real engineering problems through technology