Site Reliability Engineer (SRE) - Observability Specialist at Vodastra, Las Vegas, NV

Job in Las Vegas, Clark County, Nevada, 89105, USA
Listing for: Downtown Boulder Partnership
Seasonal/Temporary position
Listed on 2026-05-28
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, IT Support, Cloud Computing
Salary/Wage Range or Industry Benchmark: USD 100,000 - 125,000 per year
Job Description & How to Apply Below
Position: Site Reliability Engineer (SRE) - Observability Specialist at Vodastra, Las Vegas, NV

Job Description

Site Reliability Engineer (SRE) - Observability Specialist

Location: Las Vegas, NV 89101 (Onsite)
Position Type: Contract

Job Summary

We are seeking a skilled and passionate Site Reliability Engineer (SRE) with a strong focus on Observability to join our onsite team. In this role, you will design, implement, and maintain observability solutions to ensure the reliability, scalability, and performance of our systems. As an Observability Specialist, you will collaborate with development, operations, and business teams to drive improvements in system monitoring, logging, tracing, and alerting.

Key Responsibilities
  • Observability Architecture & Implementation
    • Design and implement observability solutions, including monitoring, logging, and distributed tracing, to provide actionable insights into system behavior and health.
    • Evaluate and integrate observability tools and platforms (e.g., Prometheus, Grafana, Elasticsearch, Datadog, New Relic).
  • Monitoring & Alerting
    • Define and maintain key performance indicators (KPIs) and service level objectives (SLOs) to measure system reliability and performance.
    • Develop robust alerting systems that minimize noise and provide meaningful, actionable alerts for critical issues.
  • System Reliability Engineering
    • Proactively identify system reliability risks through observability metrics and collaborate with teams to implement mitigation strategies.
    • Participate in root cause analysis (RCA) and implement solutions to prevent the recurrence of incidents.
  • Collaboration & Advocacy
    • Work closely with development and DevOps teams to embed observability best practices into the software delivery lifecycle.
    • Act as a champion for observability, educating teams on its importance and guiding them in its adoption.
  • Automation & Optimization
    • Automate repetitive observability tasks, such as dashboard creation, log parsing, and alert tuning.
    • Optimize monitoring systems to reduce overhead and enhance efficiency.
  • Documentation & Reporting
    • Create and maintain documentation for observability processes, tools, and integrations.
    • Develop dashboards and reports to provide visibility into system health and reliability for stakeholders.
  • Qualifications
    Education
    • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
    Experience
    • Proven experience in Site Reliability Engineering, DevOps, or a similar role.
    • Extensive hands‑on experience with observability tools and platforms (e.g., Prometheus, Grafana, Splunk, Elastic Stack, OpenTelemetry).
    • Experience with cloud platforms (AWS, Azure, GCP), containers (Docker), and container orchestration systems (Kubernetes).
    Skills
    • Proficiency in programming and scripting languages (e.g., Python, Go, Bash).
    • Strong understanding of distributed systems, microservices architecture, and networking.
    • Expertise in designing monitoring systems with KPIs, SLOs, and SLIs.
    • Experience with incident response, postmortem analysis, and reliability reporting.
    Preferred Qualifications
    • Certifications in cloud platforms or observability tools.
    • Familiarity with chaos engineering principles and practices.
    • Hands‑on experience with Infrastructure-as-Code (e.g., Terraform, Ansible).
    Key Competencies
    • Analytical mindset with strong problem‑solving skills.
    • Effective communication and collaboration abilities.
    • Proactive and detail‑oriented with a passion for reliability and automation.