Data Center Operations Engineer

Job in Santa Fe, Santa Fe County, New Mexico, 87503, USA
Listing for: Cadence Design Systems
Full Time position
Listed on 2025-12-18
Job specializations:
  • IT/Tech
    Systems Engineer, IT Support
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Locations: Santa Fe, New Mexico
Time type: Full time
Posted on: Posted Today
Job requisition: R51880
**At Cadence, we hire and develop leaders and innovators who want to make an impact on the world of technology.**
**Job Summary**
The Data Center Operations Engineer is responsible for supporting, maintaining, and deploying critical data center infrastructure with a strong focus on **Linux-based systems, GPU server deployments, and InfiniBand networking**. This role requires hands-on expertise in data center operations, cluster bring-up, hardware installation, and troubleshooting across compute, network, and GPU environments. The engineer will collaborate closely with global infrastructure, development, and operations teams to ensure reliable, secure, and scalable service delivery.
**Key Responsibilities**
* Provide hands-on operational support for all data center projects, deployments, and repair activities.
* Participate in an on-call rotation and provide on-site or remote support during maintenance windows and incidents.
* Troubleshoot and resolve operational issues related to Linux servers, GPU platforms, networking, and storage infrastructure.
* Support customer and internal deployments, ensuring timely and successful bring-up of GPU servers and clusters.
* Perform InfiniBand fabric bring-up, switch configuration, subnet management, and troubleshooting.
* Conduct daily health checks of Linux systems and infrastructure components, proactively identifying and mitigating risks.
* Install, configure, test, and maintain server hardware (rack and stack, labeling, HDDs, memory, CPUs, RAID batteries, NICs, etc.).
* Install, configure, and troubleshoot networking equipment including routers, switches, and terminal servers for out-of-band management.
* Review and validate equipment deployments against approved design documentation and standards.
* Support data center builds, refreshes, migrations, and expansions while adhering to quality and safety standards.
* Coordinate with vendors and onsite staff for hardware delivery, diagnostics, replacement, and warranty services.
* Utilize monitoring and alerting frameworks to identify issues, escalate appropriately, and ensure timely service restoration.
* Maintain accurate documentation of operational procedures, system configurations, and runbooks.
* Follow established incident management, escalation procedures, and service-level agreements (SLAs).
* Collaborate with global teams across time zones to support operational initiatives and continuous improvement efforts.
* Contribute to process improvement initiatives and ensure adherence to documented policies, processes, and procedures.
**Required Qualifications**
* Bachelor’s degree in Computer Science, Engineering, Information Technology, or equivalent practical experience.
* **Strong hands-on experience in Linux environments**, including system administration, troubleshooting, and performance validation.
* **Proficiency with Linux command-line tools and shell scripting** (Bash or equivalent).
* Experience with cluster bring-up, driver installation, and system-level configuration.
* Hands-on experience setting up and validating GPU servers in clustered environments.
* Experience with end-to-end GPU testing in InfiniBand-based clusters.
* Working knowledge of InfiniBand networking, including switch configuration and subnet management.
* Solid understanding of networking fundamentals, including the OSI model and TCP/IP protocol suite (IP, ARP, ICMP, TCP, UDP, SMTP, FTP, TFTP).
* Experience installing, configuring, and troubleshooting routers, switches, and terminal servers.
* Familiarity with fiber and copper cabling, including IP and SAN deployments.
* Experience managing incident tickets, maintaining acceptable ticket loads, and meeting SLAs.
* Strong organizational skills with meticulous attention to detail in data center environments.
* Ability to follow and enforce documented escalation procedures and operational policies.
* Strong verbal and written communication skills, with the ability to collaborate effectively with cross-functional and global teams.
**Preferred Qualifications**
* Experience supporting HPC, AI, or large-scale GPU environments.
* Exposure to data center monitoring.
* Experience documenting operational processes and maintaining technical runbooks.
* Familiarity with large-scale data center buildouts or refresh programs.
**Physical Requirements**
* Ability to perform the essential functions of the role, including lifting, moving, and installing equipment weighing 50 pounds or more, with or without reasonable accommodation.
* Ability to work in data center environments, including raised floors, equipment racks, and confined spaces.
* Willingness to work flexible hours, including nights, weekends, and on-call rotations as required.
**Work Environment**
* On-site data center environment with occasional remote coordination.
* Interaction with hardware…