HPC Performance Engineer
Listed on 2025-12-31
Software Development
Software Engineer, DevOps
CoreWeave HPC Performance Engineer
CoreWeave is seeking a highly skilled and motivated HPC Performance Engineer to join our HAVOCK Team, reporting into the Manager of Systems Engineering. In this role, you will play a crucial part in the design, development, and optimization of our bare‑metal systems from POST through joining a Kubernetes cluster. The team’s primary responsibilities include maintaining a custom Linux kernel, various OS images (Ubuntu‑based), the virtualization stack (KubeVirt/QEMU/VFIO), and the container/pod runtime stack (containerd/nydus/kubelet).
You will collaborate closely with cross‑functional teams, up‑stack engineering teams, and stakeholders to ensure our low‑level software stack is performant in the context of hardware updates and to provide data, metrics, dashboards, and analysis to substantiate performance assertions.
- Develop and maintain tools for establishing systems performance baselines
- Develop and maintain automation for performance regression testing and analysis
- Design and maintain performance regression test pipelines for HPC workloads
- Debug and tune fabric‑level performance to ensure low‑latency, high‑throughput configurations
- Develop telemetry for performance analysis across distributed clusters of servers
- Triage and fix performance issues in Linux
- Collect data and produce metrics and visualizations that communicate performance relative to benchmarks; use that data to drive business decisions and automation improvements
- Define Linux and OS requirements, specifications, and system architecture in relation to system performance, in collaboration with cross‑functional teams
- Python, Go, bash/sh, C
- Prometheus, VictoriaMetrics, Grafana
- Linux Kernel (custom build), Ubuntu
- Intel/AMD/ARM CPUs, NVIDIA GPUs, DPUs, InfiniBand and Ethernet NICs
- Docker, Kubernetes (k8s), KubeVirt, containerd, kubelet
- 5+ years of professional experience in Systems/HPC Performance Engineering, Benchmarking, and/or Validation
- Strong experience with MPI workloads and distributed system performance analysis
- Familiarity with RoCE, InfiniBand, GPUDirect/Data Direct I/O, NUMA, etc. in HPC workloads
- Hands‑on use of public HPC benchmarks (HPCC, HPL, OSU, MLPerf‑HPC, STREAM, IO500)
- Extensive, deep experience in Linux internals
- Fluency with a programming language geared toward automation (Python preferred, but others possible)
- Experience writing robust, testable code
- Experience diagnosing and fixing systems performance issues
- Experience implementing automated testing
- Ability to effectively prioritize and communicate proposed features and fixes in a remote‑employee environment
- Strong passion for automation, with a commitment to automating processes end to end
- Excellent documentation skills and attention to detail
- Strong analytical and problem‑solving abilities
- Familiarity with QA/QE best practices
- Familiarity with Golang
- Opinions about software version control and team collaboration
- Experience working in Cloud environments
- Experience as a software engineer writing large‑scale applications
- Experience in open‑source community software development
- Experience with machine learning is a huge bonus
We work hard, have fun, and move fast! We’re in an exciting stage of hyper‑growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:
- Be Curious at Your Core
- Act Like an Owner
- Empower Employees
- Deliver Best‑in‑Class Client Experiences
- Achieve More Together
We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for takeoff, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too.
Come join us!
The base salary range for this role is $165,000 to $242,000. The starting salary will be determined based on…