
Software Developer, AI Engineer, Machine Learning/ML Engineer

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: Ll Oefentherapie
Full Time position
Listed on 2026-02-23
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 120000 - 160000 USD Yearly
Job Description & How to Apply Below
Position: Software Developer 5

Oracle Cloud Infrastructure (OCI) is driving the development of next-generation hyperscale GPU data centers built on NVIDIA and AMD GPUs. OCI enables popular AI services such as OpenAI on GPU compute servers. We are looking for engineers experienced in working with GPU device drivers and runtime libraries (CUDA and ROCm).

You must understand GPU architectural concepts such as UVM, host-to-device, and device-to-host interactions, and be able to quantify performance issues in these interactions. We seek candidates with strong experience building and debugging GPU drivers and the Linux kernels that interact with the GPU stack, including diagnosing functional and performance issues when running GPU AI/ML/inference workloads.

The candidate should be proficient in using standard performance and stress testing tools such as DCGM, NCCL, and RCCL suites. Additionally, experience in debugging and diagnosing system issues reported via RAS events, GPU BMC, and other monitoring agents is required. Candidates should have broad knowledge of BIOS, CPU, and GPU BMC, and demonstrate strong proficiency in C programming and working knowledge of Python or other scripting languages used in AI/GPU environments.

Responsibilities include debugging during new product bring-up and on customer workloads, collaborating with GPU vendors, and handling OCI data center escalations. Candidates should be comfortable with CI/CD pipelines, building and customizing drivers for Oracle Linux and Ubuntu, unit testing, and validating GPU performance with standard benchmarks. A good understanding of the entire boot process, including BIOS and BMC touchpoints, is essential.

Strong technical and communication skills are required to collaborate with hardware and firmware teams to drive OCI's success.
