Site Reliability Engineer (SRE)
Remote / Online - Candidates ideally in
City Of London, Central London, Greater London, England, UK
Listed on 2025-12-28
Listing for:
Blackfluo.ai
Remote/Work from Home
Job specializations:
- IT/Tech
Cloud Computing, SRE/Site Reliability, Systems Engineer, IT Support
Job Description & How to Apply Below
Location: City Of London
About the job Site Reliability Engineer (SRE)
Job Description
Location: Full remote, EU timezone (CET +/- 2 hours)
Start Date: As soon as possible
Languages: English required
We are looking for a skilled Site Reliability Engineer (SRE) with deep expertise in AWS to help us scale and secure our infrastructure. As an SRE, you will be instrumental in ensuring the reliability, performance, and scalability of our production systems. You'll work closely with engineering teams to automate operations, improve monitoring, and design resilient systems.
Responsibilities:
- Design, implement, and maintain scalable, resilient AWS infrastructure
- Develop and manage CI/CD pipelines and infrastructure-as-code (Terraform or similar)
- Set up and optimize monitoring, alerting, and incident response processes
- Proactively identify and resolve performance, reliability, and security issues
- Collaborate with development teams to integrate SRE best practices into their workflows
- Conduct post-mortems and root cause analyses on incidents
- Participate in on-call rotations to support 24/7 system reliability
Requirements:
- 5+ years of experience as an SRE or in a similar role
- Deep knowledge of AWS services (EC2, ECS, RDS, Lambda, S3, etc.)
- Proficiency in infrastructure-as-code tools (Terraform, CloudFormation, etc.)
- Solid experience with Linux systems administration and networking concepts
- Experience with CI/CD tools (GitLab CI, Jenkins, etc.)
- Familiarity with observability tools (Prometheus, Grafana, Datadog, etc.)
Nice To Have:
- Experience with container orchestration (ECS, EKS, or Kubernetes)
- Understanding of security best practices in cloud environments
- Exposure to incident management frameworks (SRE handbook, etc.)
Why Join Us:
- 100% remote work with flexible hours
- High-impact role with autonomy and ownership
- Collaborative and international engineering team
- Cutting-edge tech stack with a strong focus on reliability and automation