Site Reliability Engineer Job, Charlotte area, North Carolina, USA, IT/Tech

Looking for local candidates

Want to work in technology in the financial industry?

Our client is seeking a highly motivated Site Reliability Engineer responsible for ensuring reliability, scalability, and performance of large-scale systems and applications. The role blends software engineering, infrastructure engineering, and production support, with a strong focus on automation and observability.

Key Responsibilities

Reliability & Production Ownership

Define and track service reliability goals (SLIs/SLOs) across applications
Ensure high availability, scalability, and performance of systems
Own production issues end-to-end and ensure problems do not recur

Observability & Monitoring

Design monitoring, logging, and tracing systems (dashboards, alerts)
Enhance operational visibility into platform performance
Evaluate and improve monitoring coverage for new releases

Automation & Efficiency (Toil Reduction)

Automate manual operational tasks and workflows
Build tools/software to reduce “toil” and improve efficiency
Implement CI/CD pipelines and automation frameworks

Incident Management & Root Cause Analysis

Participate in major incident triage and troubleshooting
Identify and resolve root causes of complex outages
Collaborate with problem management teams to prevent recurrence

Collaboration Across Teams

Work closely with software engineering, infrastructure, and architecture teams
Influence adoption of reliable design patterns and best practices
Drive early integration of non-functional requirements (reliability, scalability)

Performance & Capacity Planning

Identify bottlenecks, capacity constraints, and vulnerabilities
Optimize system performance and cost efficiency
Plan for growth and scaling needs

Required Qualifications

10–15+ years of experience in SRE, software engineering, or infrastructure engineering
Strong experience with cloud platforms (AWS/Azure)
Proven experience supporting large-scale distributed systems
Programming: Python, Java, or .NET
DevOps: CI/CD tools (Jenkins, Git), GitOps
Observability: Splunk, Prometheus, Grafana, Dynatrace
Systems: Linux/Unix, networking, load balancing, DNS
Service Level Indicators (SLIs) and Service Level Objectives (SLOs)
Error budgets and reliability engineering practices
Incident response and resiliency engineering
Strong collaboration and stakeholder management
Ability to lead initiatives and influence engineering culture
Problem-solving in high-pressure production environments

Benefits

Base pay rate: $140, USD

This pay rate represents mthree's good faith and reasonable estimate of the base pay for this role at the time of posting, based on the locations listed in the job advertisement. It is anticipated that qualified candidates selected for a placement will receive this pay rate as a starting salary once onsite with the mthree client; however, the ultimate salary offered may be higher or lower and will be set based on a variety of non-discriminatory factors, including but not limited to geographic location, skills, and competencies.

Applicants must be currently authorized to work in the United States on a full-time basis. The Company will not sponsor applicants for work visas.
