
Lead Site Reliability Engineer

Remote / Online - Candidates ideally in
Austin, Travis County, Texas, 78716, USA
Listing for: EPAM Systems
Remote/Work from Home position
Listed on 2026-02-24
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below

Join our team as a Lead Site Reliability Engineer to drive system reliability, observability, and performance monitoring for mission-critical digital trading products.

You will lead monitoring initiatives in a high-availability trading environment, ensuring stable connectivity to external partners while proactively identifying opportunities for continuous improvement. At EPAM, you'll work on cutting-edge technologies, solve complex challenges, and shape the future of digital innovation. With access to continuous learning, mentorship, and global projects, your expertise will drive meaningful change.

Req#

Responsibilities
  • Define and implement a strategic reliability vision for the trading portfolio, covering infrastructure, network connectivity, application performance, and throughput

  • Lead and oversee a team of SRE engineers, providing technical direction, mentorship, and performance guidance

  • Own and evolve the SLA/SLO/SLI framework, including error budgets and service health reporting

  • Configure and optimize comprehensive monitoring and alerting systems across infrastructure and applications

  • Drive observability best practices using APM and monitoring platforms (e.g., Dynatrace)

  • Analyze application and infrastructure performance to isolate fault domains and determine root causes of critical incidents

  • Lead major incident management, coordinate resolution efforts, and conduct blameless postmortems

  • Participate in 24x7x365 support rotation and ensure operational excellence across the team

  • Identify automation opportunities to improve reliability, scalability, and operational efficiency

Requirements
  • 8+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering

  • Proven leadership experience (technical lead or team lead), with ability to oversee and mentor engineers

  • Strong hands‑on experience with SLA/SLO/SLI definition, governance, and reporting

  • Solid experience working in Microsoft Azure environments (IaaS, PaaS, networking, monitoring)

  • Hands‑on experience with Dynatrace (configuration, alerting, dashboards, performance analysis)

  • Experience with observability, monitoring, and APM tools in production environments

  • Ability to operate effectively under pressure in time‑sensitive, high‑impact environments

We offer
  • Medical, Dental and Vision Insurance (Subsidized)

  • Health Savings Account

  • Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)

  • Short‑Term and Long‑Term Disability (Company Provided)

  • Life and AD&D Insurance (Company Provided)

  • Employee Assistance Program

  • Unlimited access to LinkedIn learning solutions

  • Matched 401(k) Retirement Savings Plan

  • Paid Time Off – the employee will be eligible to accrue 15‑25 paid days, depending on specific level and tenure with EPAM (accrual eligibility may change over time)

  • Paid Holidays - nine (9) total per year

  • Legal Plan and Identity Theft Protection

  • Accident Insurance

  • Employee Discounts

  • Pet Insurance

  • Employee Stock Purchase Program

  • If otherwise eligible, participation in the discretionary annual bonus program

  • If otherwise eligible and hired into a qualifying level, participation in the discretionary Long‑Term Incentive (LTI) Program

This remote position cannot be performed in New York City.

This posting includes a good faith range of the salary EPAM would reasonably expect to pay the selected candidate. The range provided reflects base salary only. Individual compensation offers within the range are based on a variety of factors, including, but not limited to: geographic location, experience, credentials, education, training; the demand for the role; and overall business and labor market considerations.

Most candidates are hired at a salary within the range disclosed. Salary range: $140,000 - $155,000. In addition, the details highlighted in this job posting above are a general description of all other expected benefits and compensation for the position.

In accordance with the LA County Fair Chance Ordinance, you may find a copy of the Notice containing a summary of the Ordinance’s key provisions here:
Concept FCO Posting 8 27 24 (lacounty.gov)

EPAM Systems, Inc. is an equal opportunity employer. We recognize the value of diversity and inclusion in creating…
