
Reliability Analyst

Job in Fort Worth, Tarrant County, Texas, 76102, USA
Listing for: Optomi
Full Time position
Listed on 2026-05-30
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, Cloud Computing, Cybersecurity
Salary/Wage Range or Industry Benchmark: 60,000 - 80,000 USD yearly
Job Description & How to Apply Below

Optomi, in partnership with one of our premier clients, is seeking a Reliability & Observability Analyst I to support 24/7 HPC data center operations. This role is ideal for early-career professionals looking to grow into Site Reliability Engineering, Infrastructure Operations, or Platform Engineering paths while gaining hands-on experience in observability, incident analysis, operational automation, and AIOps-enabled environments.

The ideal candidate will bring a strong reliability mindset, foundational Linux and networking knowledge, and experience working within production infrastructure or operations environments.

What the Right Candidate will Enjoy
  • Working within a cutting-edge HPC and data center operations environment powered by renewable energy!
  • Gaining hands-on exposure to observability, AIOps, reliability engineering, and operational automation!
  • Collaborating closely with IOC, infrastructure, and engineering teams in a highly technical environment!
  • Following a clear growth path into SRE, Platform Engineering, or Infrastructure Operations roles!
  • Exposure to enterprise observability tooling, incident analysis, and reliability initiatives!
Experience of the Right Candidate
  • 1–3 years of experience in IOC, NOC, technical operations, systems analysis, or SRE-adjacent environments.
  • Exposure to 24/7 production infrastructure, cloud, or data center operations environments.
  • Foundational understanding of SRE concepts including MTTR, MTTD, service health, and the incident management lifecycle.
  • Working knowledge of Linux systems, networking fundamentals, and infrastructure dependencies.
  • Experience working with logs, metrics, dashboards, and alerting systems.
  • Familiarity with observability platforms such as Splunk, Datadog, Prometheus, or similar tools.
  • Understanding of alert quality analysis, event correlation, anomaly detection, and monitoring gap identification.
  • Ability to review automation artifacts such as Python, Bash, or configuration-based workflows.
  • Strong analytical, troubleshooting, and communication skills with attention to operational detail.
Responsibilities of the Right Candidate
  • Analyze incident data, operational signals, and system behaviors across infrastructure and data center environments.
  • Identify alerting gaps, false positives, delayed detections, and monitoring improvement opportunities.
  • Support continuous improvement initiatives for observability, reliability, and operational reporting.
  • Validate incident, ticketing, and operational data for accuracy and reporting integrity.
  • Review outputs from AIOps and automation platforms including anomaly detection and event correlation systems.
  • Assist with alert routing, enrichment, suppression testing, and observability automation efforts.
  • Produce SLA/KPI dashboards, reliability reporting, and operational insights for engineering and leadership teams.
  • Contribute to operational documentation, runbooks, and reliability-focused process improvements.
  • Partner cross-functionally with IOC, operations, and engineering teams to support platform stability and incident response.
  • Operate within established IOC processes while progressively developing deeper SRE and infrastructure operations expertise.