Research Engineer: Spatial Perception and Reasoning
Job in San Jose, Santa Clara County, California, 95199, USA
Listed on 2026-06-21
Listing for: Honda Research Institute USA, Inc.
Full Time position
Job specializations:
- Engineering
- Artificial Intelligence, AI Evaluation, AI Engineer (Applied/Software), Robotics
Job Number: P25 F03
Honda Research Institute USA (HRI-US) is seeking a Research Engineer to advance multimodal world modeling and spatial intelligence for real-world AI systems. This is a hands‑on engineering role focused on developing and scaling robust learning systems that understand dynamic 3D environments, integrate vision, language, and temporal reasoning, and support predictive and adaptive behavior in embodied AI systems. The successful candidate will contribute to methods for spatial perception, geometric reasoning, scene understanding, video‑based reasoning, and multimodal representation learning in complex real-world environments.
San Jose, CA
Key Responsibilities
- Develop models for 3D spatial perception, geometric reasoning, and dynamic scene understanding in real‑world environments.
- Design and prototype multimodal learning systems that integrate vision, language, video, and temporal signals for spatial reasoning.
- Build robust scene understanding systems capable of handling long‑tail, ambiguous, and edge‑case scenarios using large‑scale data, simulation, or generative approaches.
- Develop world‑modeling and predictive‑reasoning methods, including learning‑based dynamics models, video prediction, and imagination‑driven planning.
- Investigate multimodal representation learning approaches that align spatial, visual, linguistic, and temporal information.
- Train, fine‑tune, evaluate, and optimize multimodal models such as VLMs, MLLMs, video‑language models, or related architectures.
- Conduct benchmarking, error analysis, and experimental evaluations to improve robustness, generalization, and real‑world performance.
- Collaborate with research teams to develop prototypes, publications, patents, and technical innovations.
Qualifications
- Master’s degree or Ph.D. in Computer Science, Electrical Engineering, Robotics, Machine Learning, or a related field.
- Strong experience designing and developing multimodal models, including VLMs, MLLMs, video‑language models, or related architectures.
- Solid foundation in 3D spatial perception and geometric reasoning, with the ability to model spatial relationships in dynamic environments.
- Experience building robust scene understanding systems for real‑world or simulated environments.
- Familiarity with world models or predictive reasoning methods, such as learning‑based dynamics models, video prediction, or planning‑oriented representations.
- Experience with multimodal representation learning that integrates vision, language, and temporal signals.
- Proficiency in Python and modern deep learning frameworks, with the ability to rapidly prototype research ideas and systems.
- Strong communication, presentation, and collaboration skills.
- Experience with vision‑text embedding alignment, vision and language encoders, adapters, MLLM training stages, or multimodal fine‑tuning pipelines.
- Knowledge of generative models such as Variational Autoencoders, Diffusion Models, or Generative Adversarial Networks.
- Experience with action understanding tasks such as action segmentation, temporal alignment, action anticipation, or activity recognition.
- Publications or research contributions in leading AI, machine learning, computer vision, or robotics venues such as CVPR, ICCV, ECCV, NeurIPS, ICLR, AAAI, RSS, CoRL, or ICRA.
Desired Start Date: 9/14/2026