Firmware Lifecycle Specialist
Listed on 2026-06-25
IT/Tech
IT Infrastructure, Systems Engineer, SRE/Site Reliability
Job Type: Full-Time | Location: San Francisco | Department: Network | Reporting to: Associate Director, Network | Work Location Type: Hybrid
IREN is a leading AI Cloud Service Provider, delivering large-scale GPU clusters for AI training and inference. IREN’s vertically integrated platform is underpinned by its expansive portfolio of grid-connected land and data centers in renewable-rich regions across the U.S. and Canada.
The Firmware Lifecycle Specialist will lead and scale firmware deployment operations across NVIDIA OEM server platforms within high-performance computing (HPC) and data centre environments. This role is responsible for defining and executing firmware lifecycle strategy, ensuring firmware packages are validated, deployed, and maintained with high quality, reliability, and operational efficiency.
As a key technical and operational leader, the Firmware Lifecycle Specialist will drive continuous improvement in firmware deployment automation, standardization, and operational performance across global infrastructure environments. The role partners closely with Firmware Engineering, Platform Engineering, OEM vendors, Operations, and Infrastructure teams to ensure firmware operations support scalable, resilient, and high-performing compute infrastructure.
Powered by 100% renewable energy, we build, own, and operate our data centers and take pride in being at the forefront of sustainable solutions for the ever-evolving applications of high-performance compute. We believe that human progress is invaluable, but it should be pursued in the right way: responsibly, sustainably, and with a positive impact on the communities we operate in.
Job Requirements
- 8–12+ years of experience in firmware deployment, infrastructure operations, platform engineering, or systems lifecycle management within large-scale infrastructure environments.
- Strong experience managing firmware lifecycle operations across NVIDIA OEM platforms, servers, or enterprise hardware environments.
- Deep understanding of firmware deployment methodologies, validation processes, automation frameworks, and operational best practices.
- Experience working within HPC, AI infrastructure, data centre, or large-scale distributed compute environments.
- Demonstrated experience improving deployment automation, operational efficiency, and deployment quality metrics at scale.
- Strong understanding of server hardware platforms, infrastructure operations, and OEM ecosystem management.
- Experience collaborating cross-functionally across engineering, operations, infrastructure, and vendor organisations.
- Strong analytical, troubleshooting, and operational problem‑solving capabilities.
- Excellent communication and stakeholder management skills, with the ability to operate effectively across technical and operational teams.
- Experience with infrastructure automation, orchestration, and deployment tooling is highly desirable.
Job Responsibilities
- Define and lead the global firmware deployment strategy across NVIDIA OEM platforms and associated infrastructure environments.
- Own end‑to‑end firmware lifecycle management, including validation, testing, deployment, maintenance, rollback planning, and upgrade governance.
- Ensure firmware deployment processes meet high standards for reliability, consistency, scalability, and operational readiness.
- Establish deployment standards, operating procedures, and governance frameworks across regions and environments.
- Drive alignment between firmware lifecycle operations and broader infrastructure reliability and performance objectives.
- Own operational KPIs for firmware deployment performance, including deployment velocity, installation success rates, deployment quality, and operational stability.
- Lead initiatives to reduce firmware deployment times and improve deployment efficiency across large-scale global infrastructure environments.
- Monitor deployment outcomes, identify operational bottlenecks, and implement corrective actions to improve reliability and consistency.
- Ensure robust reporting and visibility into firmware deployment performance, operational…