Site Reliability Engineer (SRE) - Observability Specialist at Vodastra, Las Vegas, NV
Job in Las Vegas, Clark County, Nevada, 89105, USA
Listed on 2026-05-28
Listing for: Downtown Boulder Partnership
Seasonal/Temporary position
Job specializations:
- IT/Tech
Systems Engineer, SRE/Site Reliability, IT Support, Cloud Computing
Job Description
Site Reliability Engineer (SRE) - Observability Specialist
Location: Las Vegas, NV 89101 (Onsite)
Position Type: Contract
We are seeking a skilled and passionate Site Reliability Engineer (SRE) with a strong focus on Observability to join our onsite team. In this role, you will design, implement, and maintain observability solutions to ensure the reliability, scalability, and performance of our systems. As an Observability Specialist, you will collaborate with development, operations, and business teams to drive improvements in system monitoring, logging, tracing, and alerting.
Key Responsibilities
- Design and implement observability solutions, including monitoring, logging, and distributed tracing, to provide actionable insights into system behavior and health.
- Evaluate and integrate observability tools and platforms (e.g., Prometheus, Grafana, Elasticsearch, Datadog, New Relic).
- Define and maintain key performance indicators (KPIs) and service level objectives (SLOs) to measure system reliability and performance.
- Develop robust alerting systems that minimize noise and provide meaningful, actionable alerts for critical issues.
- Proactively identify system reliability risks through observability metrics and collaborate with teams to implement mitigation strategies.
- Participate in root cause analysis (RCA) and implement solutions to prevent the recurrence of incidents.
- Work closely with development and DevOps teams to embed observability best practices into the software delivery lifecycle.
- Act as a champion for observability, educating teams on its importance and guiding them in its adoption.
- Automate repetitive observability tasks, such as dashboard creation, log parsing, and alert tuning.
- Optimize monitoring systems to reduce overhead and enhance efficiency.
- Create and maintain documentation for observability processes, tools, and integrations.
- Develop dashboards and reports to provide visibility into system health and reliability for stakeholders.
Education
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- Proven experience in Site Reliability Engineering, DevOps, or a similar role.
- Extensive hands‑on experience with observability tools and platforms (e.g., Prometheus, Grafana, Splunk, Elastic Stack, OpenTelemetry).
- Experience with cloud platforms (AWS, Azure, GCP) and with containers and container orchestration (Docker, Kubernetes).
- Proficiency in programming and scripting languages (e.g., Python, Go, Bash).
- Strong understanding of distributed systems, microservices architecture, and networking.
- Expertise in designing monitoring systems with KPIs, SLOs, and SLIs.
- Experience with incident response, postmortem analysis, and reliability reporting.
- Certifications in cloud platforms or observability tools.
- Familiarity with chaos engineering principles and practices.
- Hands‑on experience with Infrastructure-as-Code (e.g., Terraform, Ansible).
- Analytical mindset with strong problem‑solving skills.
- Effective communication and collaboration abilities.
- Proactive and detail‑oriented with a passion for reliability and automation.