Compiler Engineer - PyTorch + Kernel DSLPLATE

Job in San Jose, Santa Clara County, California, 95199, USA
Listing for: Conductor
Full Time position
Listed on 2026-06-17
Job specializations:
  • Software Development
    AI Engineer (Applied/Software), Software Engineer, Cloud Engineer - Software, DevOps
Salary/Wage Range or Industry Benchmark: 80000 - 100000 USD Yearly
Job Description & How to Apply Below
Position: Staff Compiler Engineer - PyTorch + Kernel DSLPLATE

To provide the best candidate experience amidst our high application volumes, each candidate is limited to 10 applications across all open jobs within a 6-month period.

Advancing the World’s Technology Together

Our technology solutions power the tools you use every day, including smartphones, electric vehicles, hyperscale data centers, IoT devices, and so much more. Here, you’ll have an opportunity to be part of a global leader whose innovative designs are pushing the boundaries of what’s possible and powering the future.

We believe innovation and growth are driven by an inclusive culture and a diverse workforce. We’re dedicated to empowering people to be their true selves. Together, we’re building a better tomorrow for our employees, customers, partners, and communities.

The AGI (Artificial General Intelligence) Computing Lab is dedicated to solving the complex system-level challenges posed by the growing demands of future AI/ML workloads. Our team is committed to designing and developing scalable platforms that can effectively handle the computational and memory requirements of these workloads while minimizing energy consumption and maximizing performance. To achieve this goal, we collaborate closely with both hardware and software engineers to identify and address the unique challenges posed by AI/ML workloads and to explore new computing abstractions that can provide a better balance between the hardware and software components of our systems.

Additionally, we continuously conduct research and development in emerging technologies and trends across memory, computing, interconnect, and AI/ML, ensuring that our platforms are always equipped to handle the most demanding workloads of the future. By working together as a dedicated and passionate team, we aim to revolutionize the way AI/ML applications are deployed and executed, ultimately contributing to the advancement of AGI in an affordable and sustainable manner.

Join us in our passion to shape the future of computing!

Location: Daily onsite presence at our San Jose, CA office / U.S. headquarters in alignment with our Flexible Work policy.

What You’ll Do
  • Adapting torch.compile to our backend: lowering Inductor's IR to our hardware, defining what gets fused, what gets specialized, and where the compiler should yield to hand‑written kernels.
  • Building or extending kernel DSLs for our hardware: taking a tile‑based programming model (Triton‑style), a higher‑level expression (Helion‑style), or a custom DSL we design, and lowering it to our ISA, our memory hierarchy, and our collective primitives.
  • Designing placement and scheduling passes: given a graph and our distributed memory model, deciding where tensors live, when to migrate them, and how to overlap compute with data movement.
  • Implementing parallelism‑aware lowering: making tensor, pipeline, expert, and sequence parallelism first‑class in the compiler IR rather than bolted on at the framework layer.
  • Fusion, tiling, and memory planning: the classical compiler problems, reframed for a non‑uniform memory hierarchy where the right tile size and the right placement are coupled decisions.
  • Upstream contributions: where we use open‑source DSLs, we want our work to land upstream rather than live in a private fork. You'll engage with upstream review processes for PyTorch, Triton, Helion, and adjacent projects.
What You Bring
  • Bachelor’s with 10+ years, or Master’s with 8+ years, or PhD with 5+ years of industry experience.
  • 3‑5+ years of industry experience in at least one of:
    Triton, Helion, MLIR, XLA, TVM, Inductor, IREE, CUTLASS, or a proprietary equivalent (More experienced candidates will also be considered at relevant levels).
  • Experience designing a kernel DSL or its IR from scratch, or making non‑trivial language‑level changes to an existing one.
  • Experience with MLIR — writing dialects, passes, or backend integration.
  • Experience building PyTorch backends for non‑CUDA accelerators (XPU, ROCm, MPS, TPU, custom).
  • Experience with kernel autotuning, performance modeling, or cost‑based compilation.
  • Background in HPC, distributed systems, or NUMA‑aware programming — anything that built intuition for non‑flat memory.
  • Op…