CPU RTL Engineer
Listed on 2026-06-18
Engineering
Hardware Engineer, Systems Engineer
About Velaura
Velaura is building the next generation of compute platforms for Physical AI.
As AI moves beyond the datacenter into robots, autonomous mobile systems, drones, and other embodied systems, traditional compute architectures are increasingly constrained by power, memory bandwidth, latency, real‑time requirements, and functional safety considerations.
Our mission is to develop the foundational compute technologies that enable intelligent systems to operate efficiently in the physical world.
We are assembling a team of exceptional architects and engineers to rethink how AI, sensing, memory, and control interact within a modern computing platform.
Role Overview
We are looking for a CPU RTL Engineer to help build Velaura’s next‑generation Physical AI SoC.
In this role, you will work closely with architects, performance modelers, software engineers, verification engineers, and physical design teams to transform innovative architectural concepts into production silicon. You will own significant portions of the CPU and control‑processor design, drive microarchitectural decisions, and help deliver high‑performance, power‑efficient hardware that enables the next generation of intelligent physical systems.
Responsibilities
- Own the design, implementation, and optimization of RTL for CPU cores, control processors, and associated subsystems of the Velaura SoC.
- Collaborate with architects to develop robust and efficient core microarchitectures — pipelines, branch prediction, load/store units, and cache interfaces — and drive microarchitectural decisions within your areas of ownership.
- Design for high‑speed operation: aggressive pipelining, critical‑path analysis, frequency‑aware microarchitecture, and collaboration with physical design on timing closure.
- Design for low power: clock gating, power gating, DVFS support, retention strategies, and power‑aware microarchitectural tradeoffs.
- Participate in hardware/software co‑design discussions spanning AI workloads, runtime software, firmware, and system architecture.
- Work with software teams to define ISA usage, hardware interfaces, exception and interrupt models, and performance‑critical interactions.
- Optimize designs for performance, power, area, scalability, and reliability.
- Partner closely with verification and physical design teams throughout the development cycle.
- Analyze performance bottlenecks and propose architectural and implementation improvements.
- Leverage modern engineering tools, including AI‑assisted development workflows, to improve productivity, quality, and design exploration.
- Participate in design reviews and contribute to a culture of technical excellence.
Qualifications
- 5+ years (or equivalent depth) designing RTL for complex digital systems, ideally with ownership of meaningful blocks through tapeout.
- Strong understanding of CPU architecture and microarchitecture: pipelining, hazards, speculation, branch prediction, out‑of‑order or in‑order execution tradeoffs, and memory ordering.
- Experience with high‑speed design: timing‑driven RTL coding, critical‑path optimization, pipeline balancing, and achieving timing closure at high frequencies in advanced process nodes.
- Experience with low‑power design techniques: fine‑grained clock gating, power gating, multi‑voltage domains, DVFS, and UPF/CPF‑based power intent.
- Expert‑level Verilog/SystemVerilog and modern RTL design methodologies (lint, CDC, synthesis‑aware coding).
- Strong grasp of performance, power, and area tradeoffs, and experience making data‑driven microarchitecture decisions.
- Experience with hardware/software co‑design and system‑level performance optimization.
- Strong debugging and problem‑solving skills.
- Ability to work effectively in a collaborative, multidisciplinary engineering environment.
- Hands‑on experience designing, extending, or integrating RISC‑V cores — including the RISC‑V ISA, standard extensions (e.g., vector, bit‑manipulation, hypervisor), privilege modes, and the surrounding ecosystem.
- Experience with RISC‑V‑specific infrastructure: PLIC/CLIC/AIA interrupt architectures, debug spec, trace, and platform‑level interoperability.
- Familiarity with cache and…