Site Reliability Engineer

Job in Johannesburg, 2000, South Africa
Listing for: RSAWEB
Full Time position
Listed on 2025-12-14
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, SRE/Site Reliability, Network Engineer
Job Description & How to Apply Below

Established in 2001, RSAWEB is South Africa’s fastest-growing internet service provider (ISP), with a focus on providing connectivity to home customers and a wide array of technology solutions to businesses. We are obsessed with ensuring all our customers receive the best possible digital experience and exceptional customer service. Thousands of customers have given RSAWEB a 5-star rating, and our average rating of 4.7 out of 5 on Google makes us the best-rated ISP in South Africa.

We are extremely proud of winning Best ISP in KFM’s Best of the Cape Awards in 2021 and 2022, of being one of the fastest streaming ISPs on Netflix, and of being a consistently top-rated ISP on MyBroadband. These accolades are not for nothing, as we constantly strive to improve our products, services, and solutions to enhance each customer’s experience. Having invested heavily in infrastructure, RSAWEB has built a strong presence in South Africa with Data Centres in Johannesburg and Cape Town.

Our Products and Services:

  • Fibre-to-the-Business (FTTB)
  • Mobile connectivity and data management
  • Cloud infrastructure and more!

At RSAWEB, we are passionate about using our creativity to provide innovative solutions and services that allow our customers to succeed in all areas of life. We believe that we are in the business of connecting customers and businesses with each other, and with a world of infinite possibility and opportunity, through technology. Our mission transcends our values through every customer, every interaction, every connection, every day.

Our values:

  • We Build Trust and Ownership
  • We Innovate Feverishly
  • We Go the Extra Mile
  • We Believe in Humility
  • We Communicate Openly & Honestly
  • We Make it Fun
  • We Teach, Grow & Learn
  • We Do More, With Less

Role

Purpose:

The Site Reliability Engineer (SRE) is responsible for ensuring the reliability, performance, scalability, and availability of RSAWEB's platforms, network services, and customer-facing systems. This role blends software engineering, infrastructure automation, and operations to deliver highly reliable services and improve the efficiency of technical teams.

Key Responsibilities

1. Reliability & System Performance

Maintain high availability and performance across platforms, services, and infrastructure.

Define, measure, and improve SLIs/SLOs/SLAs for critical systems.

Troubleshoot system and network reliability issues proactively.

2. Automation & DevOps Enablement

Build automation for deployments, monitoring, configuration, and operational tasks.

Improve CI/CD pipelines and assist engineers with release engineering.

Reduce manual work (toil) by implementing self-service tools and automation workflows.

Deploy, manage, and optimise cloud and on-prem infrastructure (Linux servers, virtualisation, containers).

Work with network teams to ensure resilient integration between systems and ISP network elements.

Manage and scale containerised platforms (Docker, Kubernetes).

4. Observability & Monitoring

Implement and maintain monitoring, alerting, and logging solutions (e.g., Prometheus, Grafana, ELK, Datadog).

Ensure actionable, low-noise alerting and system dashboards.

Use metrics to identify performance bottlenecks and reliability risks.

Participate in incident response, including root cause analysis and corrective actions.

Improve monitoring and automation to prevent repeated issues.

Assist with on-call rotations to support critical services.

6. Security & Compliance

Implement security best practices across systems and deployments.

Support vulnerability scanning, patching, and secure configurations.

Ensure compliance with internal and industry standards (ISO, POPIA, etc.).

Work closely with Network Engineering, DevOps, Software Development, and NOC teams.

Provide technical guidance in system design, scalability, and reliability improvements.

Improve operational processes through documentation and automation.

Requirements

Minimum Qualifications

Diploma or degree in Computer Science, Engineering, Information Technology, or related field.

Relevant certifications (AWS/Azure/GCP, Linux, Kubernetes, Terraform) are beneficial.

Experience Requirements

3–5+ years in SRE, DevOps, Systems Engineering, or Infrastructure roles.

Experience…
