Kernel Engineer

Job in Mountain View, Santa Clara County, California, 94039, USA
Listing for: Acceler8 Talent
Full Time position
Listed on 2026-01-11
Job specializations:
  • IT/Tech
    Hardware Engineer, AI Engineer

This range is provided by Acceler8 Talent. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base pay range

$/yr - $/yr

Additional compensation types

Stock options

Kernel Engineer – AI Hardware - Hybrid | Mountain View, CA

Acceler8 Talent is seeking an experienced Kernel Engineer to join a well-funded startup based out of Mountain View whose hardware promises to drastically change the economics of AI compute for the largest and most demanding models.

Founded by engineers behind some of the industry’s most successful semiconductor and AI platforms, this company is building a next-generation hardware-software stack designed to push the limits of performance and efficiency for large-scale AI workloads.

As a Kernel Engineer, you will be responsible for designing and optimizing performance-critical kernels that interface directly with custom AI hardware. You will work closely with ML Research and Hardware Engineering teams, providing a programmer’s perspective on hardware architecture and ensuring tight integration across the software stack.

Responsibilities
  • Design, implement, and optimize high-performance kernels that interface directly with custom AI hardware.
  • Partner closely with ML Research and Hardware Engineering teams to translate algorithmic intent into efficient kernel implementations.
  • Provide architectural feedback and guidance from a programmer’s perspective to influence hardware and system design decisions.
  • Optimize kernels using techniques such as parallelism, SIMD/vectorization, low-level memory optimization, and instruction-level tuning.
  • Support performance analysis, profiling, and debugging across kernels, runtime, and hardware.
Requirements
  • Bachelor’s degree in Computer Science or equivalent practical experience.
  • Experience optimizing software for specialized or accelerator hardware, including techniques such as parallel programming, SIMD, low-level C/C++, assembly-level optimization, or GPU/CUDA programming.
  • Proficiency in at least one of: Assembly, C, C++, Zig, or Rust.
  • Strong understanding of performance bottlenecks across compute, memory, and data movement.
Preferences
  • Experience implementing kernels for ML workloads, including models such as Transformers.
  • Familiarity with distributed and parallel execution models, including All Reduce, All To All, data parallelism, and tensor parallelism.
  • Working knowledge of compiler fundamentals and how code is lowered, optimized, and executed on modern hardware.

If you're interested in building the future of AI compute, apply here or reach out to me at to discuss further.

Seniority level

Mid‑Senior level

Employment type

Full‑time

Job function

Engineering and Science

Industries

Semiconductor Manufacturing, Computers and Electronics Manufacturing, and Computer Hardware Manufacturing
