Junior Site Reliability Engineer

We're building a MAS-regulated marketplace and clearing house, and we want you to grow with us.

Our Site Reliability Engineering team operates mission-critical platforms that power real-time trading and clearing services. We're looking for a hungry, hands-on engineer who's solid on the fundamentals and ready to level up fast in a high-stakes, high-support environment. You don't need to have done it all — you need to be the kind of person who figures things out, automates the boring stuff, and cares deeply about reliability.

🔧 What You'll Do

Cloud Infrastructure

Provision and manage AWS infrastructure using Terraform
Work with VPCs, subnets, security groups, and route tables to support secure network design
Support Kubernetes (K8s) cluster operations for containerised workloads
Contribute to multi-account governance under AWS Control Tower
Help configure and maintain Cloudflare for DNS, WAF, and edge security

Operations & Reliability

Monitor, patch, and tune infrastructure to keep systems healthy and performant
Participate in incident response and post-mortems
Support vulnerability identification and remediation workflows
Collaborate with the security team on IAM, encryption, and data protection practices
Learn and apply CSPM tooling to maintain cloud security posture

Automation & Collaboration

Build and improve CI/CD pipelines for infrastructure provisioning
Write scripts to automate repetitive operational tasks
Work closely with app developers to support reliable platform delivery
Document what you build — designs, runbooks, and security controls

✅ Must-Haves

Solid Linux fundamentals — you're comfortable navigating systems, reading logs, and troubleshooting from the command line
Scripting experience in at least one language — you automate things instead of doing them twice
Working knowledge of core AWS services (EC2, VPC, IAM, S3, CloudWatch)
Hands-on experience with Terraform — you can read a plan, write basic configs, and apply changes without hand-holding
Familiarity with Git and CI/CD pipelines — branching, PRs, and understanding how code gets from repo to production
An observability mindset — you know the difference between a log, a metric, and an alert, and you reach for them instinctively when something breaks
1+ years of hands-on experience in an infrastructure, cloud, or SRE-adjacent role
Eagerness to learn fast in a high-stakes, regulated environment — we'll invest in you if you invest in the work

⭐ Great to Have (we'll help you get here)

Exposure to Kubernetes — even local clusters or coursework counts
Familiarity with Cloudflare products (DNS, WAF)
Awareness of CSPM or cloud compliance frameworks (CIS, NIST, ISO 27001)
Hands-on experience with GitOps tools like Argo CD or Flux
Experience with the Elastic Stack for log aggregation and observability
Familiarity with PagerDuty or similar incident management platforms