Senior Research Scientist - Vision-Language-Action (VLA) Models Job, Sunnyvale area, California, USA, Engineering

Position: Senior Research Scientist - Vision-Language-Action (VLA) Models
Company Description

The Bosch Research and Technology Center North America, with offices in Sunnyvale, California, Pittsburgh, Pennsylvania, and Cambridge, Massachusetts, is a part of the global Bosch Group, a company with over 70 billion euro revenue, 400,000 employees worldwide, a very diverse product portfolio, and a history spanning over 125 years. The Research and Technology Center North America (RTC-NA) is dedicated to providing technologies and system solutions for various Bosch business fields, primarily in the fields of artificial intelligence, energy technologies, internet technologies, circuit design, semiconductors and wireless, as well as advanced MEMS design.

As a part of the global research, our AI research in Silicon Valley focuses on Foundation Models, Big Data Visual Analytics, Explainable AI (XAI), Natural Language Processing, Computer Vision & Mixed Reality, Cloud Robotics, Data Science, AI System Engineering, and Time-series Analysis. We develop scalable, intelligent, and trustworthy AIoT solutions for Bosch products and services in application areas such as automated driving, advanced driver assistance systems (ADAS), robotics, smart manufacturing, enterprise AI, health care, and smart home and building solutions.

Originating from the AI research in Silicon Valley, our Intelligent Autonomous Systems group is responsible for enabling future autonomous Bosch products by pushing the boundaries of automated driving, advanced driver assistance systems (ADAS), robotics, and automation through key innovations that encompass system architecture and AI components. These include methods for motion planning, high-level task planning, and decision making, as well as systems that make these technologies work on real products by building frameworks that take advantage of reliable distributed computing.

We work with internal partners of different Bosch business units to transfer our solutions into future products. We also actively collaborate with leading groups in academia and industry to promote research ideas and publish research findings in internationally renowned conferences and journals such as CVPR, ICRA, IROS, RSS, NeurIPS and CoRL.

Job Description

As a Senior Research Scientist - Vision-Language-Action (VLA) Models, you contribute to research projects at the forefront of the ADAS/AD industry.

Key responsibilities include:

* Conduct research and engineering in core AI and machine learning fields to enable Embodied AI (including computer vision, autonomous planning, and open-world learning) for related business domains such as ADAS/AD, industrial automation, and robotics.

* Push the boundaries in (modular) end-to-end perception and planning for ADAS/AD, incorporating advancements in large vision-language-(action) models to aid reasoning capabilities and explainability.

* Collaborate cross-functionally with global research and engineering teams to ensure seamless technology transfer and system integration.

* Implement research results to solve real-world challenges, ensuring high-quality system integration within Bosch's existing platforms.

* Stay at the forefront of innovation by actively engaging with academic and industry communities through conferences, workshops, and technical events.

* Document and disseminate research findings through high-caliber publications and/or patent submissions.

Qualifications

Basic Qualifications

* Ph.D. in Computer Science, Robotics, or a related discipline, or Master's degree with ≥ 2/4 years of industry experience after graduation.

* A minimum of 5 years of R&D experience, or an equivalent graduate research background, primarily in AI technologies including Computer Vision and Robotic or Automotive Motion and Behavioral Planning.

* Proficiency in one or more programming languages commonly used in machine learning (e.g., Python, C++, Rust).

* Strong interpersonal, communication, and teamwork capabilities.

* Knowledge of major machine learning frameworks like TensorFlow or PyTorch.

* Hands-on experience in reinforcement learning for behavior or motion planning or other applicable contexts, and familiarity with common RL techniques (e.g., PPO, DQN, DDPG).

* A strong portfolio of publications in premier machine learning, deep learning, robotics and computer vision journals and conferences.

Preferred Qualifications

* Experience with real-world product development and deployment of autonomous systems.

* Hands-on experience building and applying multimodal transformer-based sequence-to-sequence models, especially multimodal vision-language-action models.

* Hands-on experience in computer vision and deep learning, with work in any of the following areas: multimodal transformers, multimodal language models, diffusion models, NeRF, Gaussian splatting, object detection / segmentation, 3D scene understanding, sensor calibration, SfM, voxel/BEV grid-based feature representation.

Additional Information

We offer a competitive base salary for this position with a range in…