Data Center Reliability Engineer Job, Abilene area, Texas, USA, Engineering

As a Reliability Engineer, you will apply data-driven analysis and engineering problem-solving to improve availability and reduce risk across mission-critical facility systems. You will identify failure patterns early, drive corrective actions, and build tooling and metrics that improve reliability. This role manages ongoing critical environment maintenance by completing standard diagnostics and repairs and resolving issues. It also manages incidents impacting services and conducts root cause analysis to mitigate recurrence and improve system resilience.

Conducts data center build site reviews and assessments in collaboration with other teams to evaluate suitability for data center builds. Supports and validates on-site data center operations in relation to the electrical or mechanical infrastructure. Coordinates with internal and external project team members in delivering specific aspects of data centers or part-data centers for Oracle.

Key Responsibilities

Monitor and analyze operational telemetry, alarms, and performance trends to identify emerging risks and reliability degradation.
Define and track reliability KPIs; deliver concise analysis and recommendations that drive operational and engineering decisions.
Develop and maintain analytics and reporting tools using Python, SQL, and/or DCIM/BMS/SCADA data sources.
Support and/or lead RCAs and corrective action tracking for recurring or high-impact issues, ensuring follow-through and verification.
Partner with operations and engineering teams to improve preventive strategies, automation opportunities, and compliance execution.
Contribute to reliability standards and documentation that improve repeatability across sites.

Ideal Candidate Profile

Experience in reliability or systems analysis in data centers or other uptime-critical environments (utilities, telecom, manufacturing).
Engineering degree or equivalent applied experience; comfort with data and tooling is required.

Skills and Competencies

Strong analytical and visualization skills; disciplined technical documentation.
Able to influence outcomes through evidence, clarity, and structured thinking.

Why Oracle Cloud Infrastructure?

Global impact at scale:
Contribute directly to how mission-critical OCI data centers operate across regions and continents, influencing infrastructure reliability, security, sustainability, and long-term capacity growth.
Technically rigorous environment:
Work alongside experienced engineers, automation specialists, and compliance teams in a rapidly scaling hyperscale cloud infrastructure, where disciplined execution and technical depth matter.
Culture built on operational excellence:
Join an organization that values safety, process rigor, clear accountability, and continuous improvement as foundational to protecting uptime and customer trust.
Long-term career development:
Benefit from internal mobility, role-based technical training, and development opportunities designed for professionals building long-term careers in cloud infrastructure and facilities operations.
