Senior HPC Software Engineer
Listed on 2026-06-17
IT/Tech
Unix/Linux, SRE/Site Reliability
We are seeking a senior technical contributor to help support, modernize, and scale our on‑premise high‑performance computing platform. This role will work across Linux systems administration, HPC operations, Kubernetes‑based services, automation, observability, software tooling, and user‑facing platform delivery.
The ideal candidate has deep experience administering RHEL‑based systems in complex compute environments and is comfortable troubleshooting issues across operating systems, schedulers, storage, networking, containers, applications, and user workloads.
We are looking for someone who can balance strong technical depth with a user‑focused delivery mindset. This role requires the ability to work collaboratively with platform engineers, application teams, and technical users to identify pain points, resolve production issues, document repeatable processes, and build durable improvements. The right candidate will be a pragmatic team player who is comfortable in a fast‑moving environment and motivated by making complex, large‑scale on‑prem infrastructure easier to operate, automate, observe, and continuously improve.
Responsibilities
- Administer, troubleshoot, and improve RHEL‑based HPC environments supporting CPU and GPU workloads.
- Create and maintain HPC services across compute, storage, networking, scheduling, Kubernetes, and observability.
- Develop tools, scripts, APIs, integrations, and automation using Python, Go, Bash, or similar languages.
- Apply software engineering best practices, including Git workflows, code reviews, testing, modular design, and CI/CD.
- Support and help update HPC scheduling environments, with Slurm experience preferred.
- Improve monitoring, alerting, dashboards, and operational visibility using Grafana, Prometheus, Dynatrace, and related tools.
- Partner with users, customers, and internal engineering teams to understand requirements, resolve issues, and improve platform usability.
- Create and maintain documentation, architecture notes, user guides, and operational procedures.
- Drive platform modernization focused on reliability, scalability, automation, security, and maintainability.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience.
- 10+ years of experience in systems engineering, infrastructure engineering, platform engineering, or a related technical role.
- Strong Linux systems administration experience, preferably with RHEL.
- Experience with Slurm, PBS, or another HPC workload manager.
- Experience creating APIs, applications, and services that support platform operations and user workflows.
- Experience supporting production compute, infrastructure, and large‑scale technical environments.
- Hands‑on experience with scripting and software development using Python, Go, Bash, or similar languages.
- Familiarity with CI/CD concepts, GitHub, and modern software delivery practices.
- Strong troubleshooting skills across operating systems, services, networking, storage, and application layers.
- Ability to write clear documentation and communicate effectively with both technical and non‑technical stakeholders.
- Strong ownership mindset with the ability to drive issues to resolution.
- Ability to use independent judgement to make sound technical decisions.
Benefits
- Immediate medical, dental, and prescription drug coverage.
- Flexible family care, parental leave, new parent ramp‑up programs, subsidized backup child care and more.
- Vehicle discount program for employees and family members, and management leases.
- Tuition assistance.
- Established and active employee resource groups.
- Paid time off for individual and team community service.
- A generous schedule of paid holidays, including the week between Christmas and New Year’s Day.
- Paid time off and the option to purchase additional vacation time.
For a detailed look at our benefits, see the Benefit Summary.
Salary:
Grade 8, ranging from $113,580 to $192,900.
Visa sponsorship is not provided for this role.
Candidates for positions with Ford Motor Company must be legally authorized to work in the United States. Verification of employment eligibility will be required at the time of hire.
We are an Equal Opportunity Employer committed to a culturally diverse workforce. All qualified applicants will receive consideration for employment without regard to race, religion, color, age, sex, national origin, sexual orientation, gender identity, disability status or protected veteran status. In the United States, if you need a reasonable accommodation for the online application process due to a disability, please call