Quality & Risk Manager
Listed on 2026-02-16
IT/Tech
Systems Engineer
Nscale is the GPU cloud engineered for AI. We deliver cost‑effective, high‑performance infrastructure that enables AI‑first companies to scale rapidly while reducing complexity across design, build, and operations. Our platform supports strategic business outcomes across performance, cost efficiency, and sustainability.
We operate with a culture of ownership, accountability, and relentless improvement. As an Nscaler, you’ll work alongside high‑performing teams to build the systems, processes, and infrastructure that power the future of AI at scale.
Role Overview
The Quality & Risk Manager is responsible for establishing and operating Nscale’s risk and quality assurance framework across all enterprise and business‑as‑usual (BAU) deployments. This role provides independent oversight, structured challenge, and practical support to ensure that complex GPU and data‑centre programmes are delivered safely, predictably, and to defined standards.
The role sits at the intersection of programme delivery, engineering, and governance. The successful candidate will bring strong risk management discipline, an eye for quality, and the credibility to work with senior PMs, architects, and engineers while escalating issues decisively when required.
Key Responsibilities
- Own and continuously improve the enterprise risk management framework across all GPU and data‑centre deployments.
- Define and maintain standardised risk processes, templates, thresholds, and escalation paths for Enterprise and BAU programmes.
- Facilitate structured risk identification workshops and ensure every programme maintains a live risk register with clear ownership and mitigations.
- Maintain a consolidated, portfolio‑level view of risks across all deployments, including capacity, supply chain, power, fabric, multi‑DC dependencies, and regulatory considerations.
- Identify cross‑programme and systemic risks and elevate critical items to senior leadership.
- Support leadership decision‑making by providing clear risk insights, mitigation options, and contingency scenarios.
- Define and own quality standards covering design, build, testing, commissioning, and handover.
- Establish checklists, acceptance criteria, and documentation requirements aligned to Nscale’s reference architectures (e.g. 10K / 20K / 50K GPU designs).
- Plan and conduct independent assurance reviews at key delivery gates, including design freeze, pre‑build, pre‑go‑live, and post‑implementation.
- Coach project teams on the practical application of risk and quality standards.
- Support PMs in structuring effective risk registers and mitigation plans.
- Lead or support root‑cause analysis when issues occur, ensuring corrective and preventive actions are clearly defined.
- Ensure lessons learned are systematically captured and fed back into standards, playbooks, and reference designs.
- Monitor adherence to defined methods, standards, and risk processes across all programmes.
- Track exceptions, trends, and recurring issues and report them to the Head of Enterprise Deployments.
- Define, track, and report KPIs related to risk and quality performance (e.g. unmanaged high risks, assurance findings, defect rates, audit outcomes).
- Risk Management Framework & Playbook – documented processes, RACI, templates (risk registers, impact/likelihood matrices, escalation paths) tailored to GPU and data‑centre deployments.
- Programme & Portfolio Risk Register – up‑to‑date risk logs for each major deployment and a consolidated portfolio‑level risk dashboard for leadership review.
- Quality Management Plan & Standards – defined quality criteria for designs, engineering work packs, test plans, and handover documentation, aligned to Nscale reference architectures.
- Assurance Review Reports – formal outputs from design reviews, readiness reviews, and post‑implementation reviews, including findings, risk ratings, and agreed actions.
- Compliance & Audit Evidence – clear evidence of alignment with internal policies, safety and regulatory requirements, and any external standards adopted by Nscale.
- Lessons Learned & Continuous…