Senior Site Reliability Engineer – Distributed Systems

Job in Arizona City, Pinal County, Arizona, 85223, USA
Listing for: Cognizant
Full Time position
Listed on 2026-01-02
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD Yearly
Job Description & How to Apply Below

About the role

As a Site Reliability Engineer, you will make an impact by designing and implementing observability solutions tailored for distributed edge computing environments. You will be a valued member of the Technology & Engineering team and work collaboratively with cross-functional teams to ensure system reliability, performance, and visibility across remote facilities.

In this role, you will
  • Design and implement observability frameworks for edge computing environments, including monitoring, logging, tracing, and metrics collection.
  • Define and maintain SLIs, SLOs, and business KPIs to measure and enhance system reliability across edge and centralized infrastructure.
  • Build dashboards, visualizations, and alerting systems for real-time insights and incident response.
  • Implement distributed tracing and log aggregation systems to troubleshoot complex edge issues.
  • Collaborate with engineering teams to embed observability best practices into edge applications and infrastructure.
  • Proactively identify issues using advanced observability tools, reducing MTTD and MTTR.
  • Lead incident postmortems and implement observability-driven improvements.
  • Develop automation scripts and tools to optimize observability pipelines for bandwidth-constrained environments.
  • Optimize data storage and querying strategies for performance, cost, and scalability.
  • Stay current with emerging observability trends and advocate for adoption of edge-specific solutions.

Work model

At Cognizant, we strive to provide flexibility wherever possible, and we are here to support a healthy work-life balance through our various wellbeing programs. Based on this role’s business requirements, this is an onsite position requiring 5 days a week in a client or Cognizant office.

Please note:

This role will require an in-person meet and greet at our Cognizant office or client location.

The working arrangements for this role are accurate as of the date of posting. This may change based on the project you’re engaged in, as well as business and client requirements. Rest assured, we will always be clear about role expectations.

What you need to have to be considered
  • 10+ years of IT experience.
  • 3–5 years of experience in service reliability/operations for large-scale hybrid environments.
  • 3–5 years of experience writing automation scripts and building dashboards for application performance management.
  • 2–4 years of experience with programming languages such as Go, Python, Java, or Rust.
  • Working knowledge of databases such as Oracle, SQL Server, Redis, ClickHouse, PostgreSQL, MongoDB, or time-series databases.
  • At least 2 years of experience with cloud platforms and containerization (GCP, AWS, Rancher, Azure, OpenShift).
  • Experience maintaining containerized apps in GKE/RKE/AKE environments.
  • Experience implementing cloud observability using OpenTelemetry (OTel).
  • Experience with GraphQL frameworks (Apollo, Prisma, Hasura).
  • Strong understanding of networking protocols (TCP/IP, HTTP, DNS, load balancing, service mesh).

These will help you stand out
  • Proven experience managing application availability and building automation for high-availability platforms.
  • Hands-on experience with monitoring tools like Splunk, AppDynamics, Grafana/Prometheus, and Dynatrace.
  • Experience with CI/CD tools and extenders such as Rally and Confluence.
  • Experience with in-memory caching solutions (Redis preferred).
  • Strong debugging skills across integrated technical platforms and API gateways.
  • Hands-on experience with GCS, Cloud SQL, Spanner, and Firestore.
  • Experience in enterprise-level infrastructure and operations.
  • Expertise in high-availability and distributed systems, Linux/Windows administration, and support.
  • Experience monitoring and troubleshooting HashiCorp Vault environments.
  • Working knowledge of Vertex AI, Gen AI, and BigQuery.

Bachelor’s degree in computer science, IT or equivalent

Salary and Other Compensation

The annual salary for this position depends on the experience and other qualifications of the successful candidate.

This position is also eligible for Cognizant’s discretionary annual incentive program, based on performance and subject to the terms of Cognizant’s applicable plans.

Benefits:
Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:

  • Medical/Dental/Vision/Life Insurance
  • Paid holidays plus Paid Time Off
  • 401(k) plan and contributions
  • Long-term/Short-term Disability
  • Paid Parental Leave
  • Employee Stock Purchase Plan

Disclaimer:
The salary, other compensation, and benefits information is accurate as of the date of this posting. Cognizant reserves the right to modify this information at any time, subject to applicable law.
