Safe RL Control Engineer
Listed on 2026-02-07
Engineering
Robotics, Systems Engineer
Overview
Humanoid is the first AI and robotics company in the UK, creating the world’s most advanced, reliable, commercially scalable, and safe humanoid robots. Our first humanoid robot, HMND 01, is a next-gen labour automation unit that provides highly efficient services across various use cases, starting with industrial applications.
Mission
At Humanoid, we strive to create the world’s leading, commercially scalable, safe, and advanced humanoid robots that seamlessly integrate into daily life and amplify human capacity.
Vision
In a world where artificial intelligence opens up new horizons, our faith in its potential unveils a new outlook where, together, humans and machines build a new future filled with knowledge, inspiration, and incredible discoveries. The development of a functional humanoid robot underpins an era of abundance and well-being where poverty will disappear, and people will be able to choose what they want to do.
We believe that providing a universal basic income will eventually be a true evolution of our civilization.
As the demands on our built environment rise, labour shortages loom. With the world’s workforce increasingly moving away from undesirable tasks, the manufacturing, construction, and logistics industries critical to our daily lives are left exposed. By deploying our general-purpose humanoid robots in environments deemed hazardous or monotonous, we envision a future where human well-being is safeguarded while closing the gaps in critical global labour needs.
We are looking for an exceptional Senior or Staff Control Engineer to join our Control Team in Boston.
You will be a key contributor to the development and evolution of our whole-body control (WBC) software stack - the layer that unifies locomotion, manipulation, and interaction control for our robotic systems.
The ideal candidate combines a strong background in classical control with the ability to develop and integrate reinforcement-learning-based control components into complex, real-time systems. You will work at the intersection of robot dynamics, control architecture, and modern learning-driven control, collaborating closely with engineers in London and Vancouver who share responsibility for our global control infrastructure.
A key focus of this role will be ensuring safety and robustness in loco-manipulation behaviors of bipedal robots - designing control strategies that guarantee safe, stable, predictable, and recoverable interaction between locomotion and manipulation subsystems in dynamic environments.
This is a hands-on, system-defining role for someone passionate about high-performance robotic control - from model-based design to the deployment of advanced control strategies that bring robots to life.
What You’ll Do
Whole-Body Control Architecture:
- Design, implement, and extend whole-body control frameworks that coordinate multiple robot subsystems (locomotion, manipulation, teleoperation).
- Develop and maintain mid-level controllers that translate motion objectives into coherent, stable, real-time control actions.
- Ensure controllers are modular, deterministic, and extensible, supporting both classical and learning-based control strategies.
- Architect and tune low-level controllers for balanced performance, supporting compliant behaviors for learning tasks and precise fallback modes for safety.
- Develop and enforce safety mechanisms within WBC to manage contact, stability, and recovery during combined locomotion and manipulation (loco-manipulation) behaviors.
- Develop and integrate RL-based controllers and policies within the WBC architecture.
- Define clear, robust interfaces between classical controllers and learned components, enabling smooth blending and fallback behaviors.
- Collaborate with the Imitation Learning and Deployment teams to ensure compatibility of runtime systems and deployment pipelines - while maintaining full ownership of control and WBC components.
- Shape RL action spaces to promote safe exploration, avoiding extreme behaviors while enabling smooth policy execution.
- Work with deployment teams to align RL outputs with hardware realities, using simulation penalties and transfer techniques for…