
Research Scientist, Generative Worlds

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Google DeepMind
Full Time position
Listed on 2026-01-01
Job specializations:
  • IT/Tech
    Artificial Intelligence
  • Engineering
    Artificial Intelligence
Job Description & How to Apply Below
Snapshot

Help us build generative models of the 3D world. World models power numerous domains, such as media generation, visual reasoning, simulation, planning for embodied agents, and real-time interactive experiences. Work with us to build better versions of Gemini, Genie, and Veo, while also exploring new, spatial modalities beyond images and videos.

The Role

Key responsibilities:

Conduct research to build generative multimodal models of the 3D world. Solve essential problems to train world models at massive scale: build and train large-scale systems for data annotation, curate and annotate training datasets, build and maintain large model training infrastructure, develop scaling ladders and training recipes, develop metrics for spatial intelligence, enable real-time interactive experiences, study the integration of spatial modalities with multimodal language models, and, of course, train massive-scale models.

Areas of focus:

• 3D computer vision, spatial annotation systems

• Spatial representations

• Generative pixel and latent models

• Infrastructure for large-scale data pipelines and annotation

• Quantitative evals for spatial accuracy and intelligence

About you

We seek individuals who are passionate about large-scale generative models and believe spatial understanding and generation are on the path to intelligence. We strive for simple methods that scale and look for candidates excited to improve models through infrastructure, data, evals, and compute.

To set you up for success as a Research Scientist/Engineer at Google DeepMind, we look for the following skills and experience:

• MSc or PhD in computer science or machine learning, or equivalent industry experience.

• Experience with large-scale transformer models and/or large-scale data pipelines.

• Track record of releases, publications, and/or open source projects relating to video generation, world models, multimodal language models, or transformer architectures.

• Exceptional engineering skills in Python and deep learning frameworks (e.g., JAX, TensorFlow, PyTorch), with a track record of building high-quality research prototypes and systems.

• Demonstrated experience in large-scale training of multimodal generative models.

In addition, the following would be an advantage:

• Experience building training codebases for large-scale video or multimodal transformers.

• Expertise optimizing efficiency of distributed training systems and/or inference systems.

• Strong background in 3D representations or 3D computer vision.

• Strong publication record at top-tier machine learning, computer vision, and graphics conferences (e.g., NeurIPS, ICLR, ICML, SIGGRAPH, CVPR, ICCV).

• A keen eye for visual aesthetics and detail, coupled with a passion for creating high-quality, visually compelling generative content.
