Kernel Optimization Engineer
Listed on 2026-02-08
Software Development
AI Engineer, Software Engineer, Machine Learning/ ML Engineer, Embedded Software Engineer
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach enables Cerebras to deliver industry-leading training and inference speeds for large-scale ML applications.
Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services.
About The Role
As a Kernel Engineer on our team, you will develop high-performance software solutions at the intersection of hardware and software, focusing on implementing, optimizing, and scaling deep learning operations to leverage our custom, massively parallel processor architecture. You will be part of a world-class team responsible for the design, performance tuning, and validation of foundational ML and HPC kernels, building a library of parallel and distributed algorithms to maximize compute utilization and push the boundaries of training efficiency for state-of-the-art AI models.
Your work will unlock the full potential of our hardware and accelerate AI innovation.
- Develop design specifications for new machine learning and linear algebra kernels and map them to the Cerebras WSE System using various parallel programming algorithms.
- Develop and debug a library of highly optimized kernel routines in low-level assembly and a custom C-like domain-specific language (CSL), implementing algorithms that target the Cerebras hardware system.
- Use mathematical models and analysis to measure software performance and inform design decisions.
- Develop and integrate unit and system testing methodologies to verify correct functionality and performance of kernel libraries.
- Study emerging trends in machine learning applications and help evolve kernel library architecture to address computational challenges of state-of-the-art neural networks.
- Interact with chip and system architects to optimize instruction sets, microarchitecture, and IO of next-generation systems.
- Bachelor’s, Master’s, or PhD degree, or foreign equivalent, in Computer Science, Computer Engineering, Mathematics, or a related field.
- Understanding of hardware architecture concepts and the ability to learn the details of a new hardware architecture.
- Skilled in C++ and Python programming languages.
- Good knowledge of library and/or API development best practices.
- Strong debugging skills and experience debugging complex software stacks.
- Experience in kernel development and/or testing.
- Familiarity with parallel algorithms and distributed memory systems.
- Experience programming accelerators such as GPUs and FPGAs.
- Familiarity with neural networks and machine learning frameworks such as TensorFlow and PyTorch.
- Familiarity with HPC kernels and their optimization.
People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that unlocks new opportunities for the AI industry. With rapid growth and model releases, Cerebras offers opportunities to contribute to cutting-edge AI research and one of the fastest AI systems in the world.
- Build a breakthrough AI platform beyond the constraints of GPUs.
- Publish and open source cutting-edge AI research.
- Work on a fast AI supercomputer.
- Enjoy job stability with startup vitality.
- Engage in a simple, respectful work culture that values individual beliefs.
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer.