Senior Research Scientist - Generative Video

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Canva
Full Time position
Listed on 2026-01-04
Job specializations:
  • Software Development
    Data Scientist, AI Engineer

Where and How You Can Work

Our head office is in Sydney, Australia, but San Francisco is now home to our US operations. The role is listed as hybrid, meaning we are flexible and empower you to work where you prefer – whether that’s at home or at the office.

Job Description

About the Role

At Canva, we’re building a future powered by AI that’s as magical as it is impactful. As a Senior Research Scientist (Generative Video), you’ll help push the boundaries of video generation and editing—turning cutting‑edge research into practical, scalable capabilities that empower millions of creators.

This role blends hands‑on applied research with strong technical ownership. You’ll design, train, and evaluate generative video models, collaborate closely with engineering and product partners, and help translate breakthroughs in video diffusion and multimodal learning into real‑world experiences in Canva.

• Own and deliver research projects that advance Canva’s generative video capabilities (text‑to‑video, image‑to‑video, video‑to‑video, video editing).

• Design and run rigorous experiments to validate hypotheses and improve quality, controllability, temporal coherence, and runtime performance.

• Develop and improve model architectures and training pipelines for video generation, including diffusion‑based approaches and complementary techniques.

• Translate research into production impact by partnering with ML engineers to scale training/inference and integrate models into Canva’s product ecosystem.

• Advance evaluation and benchmarking for generative video, including perceptual quality, motion fidelity, temporal consistency, identity preservation, prompt adherence, safety, and robustness.

• Explore data strategies for video (curation, filtering, deduplication, captioning/annotation, synthetic data, bootstrapped labeling) that improve model reliability and controllability.

• Contribute to the research roadmap by tracking emerging trends, proposing new directions, and identifying high‑leverage problems in generative video.

• Share knowledge through internal write‑ups, talks, cross‑team reviews, and (where appropriate) external publications or conference engagement.

You’re Probably a Match If You

You thrive in ambiguity, love connecting deep research to product outcomes, and can independently drive meaningful research work from idea to deployment. You balance scientific rigor with practical delivery, communicate clearly with cross‑functional partners, and have strong instincts for what will make models useful.

We’re Looking For Someone Who Brings

• Deep expertise in generative video modeling, including strong familiarity with modern approaches such as:

• Video diffusion (latent diffusion for video, spatiotemporal U‑Nets/DiTs, conditional diffusion, guidance strategies, scheduler choices).

• Temporal modeling techniques (3D/2+1D convs, temporal attention, factorized attention, optical‑flow‑aware modeling, recurrent/streaming approaches).

• Controllability methods (ControlNet‑style conditioning for video, pose/depth/segmentation conditioning, motion control, camera control, keyframes, masks, and edit constraints).

• Consistency and identity preservation (subject‑consistent generation, reference‑based conditioning, feature/embedding locking, token/adapter strategies, multi‑view constraints where relevant).

• Efficient training and adaptation (LoRA/adapters, distillation, latent‑space tricks, progressive training, multi‑stage pipelines, mixed precision, distributed training).

• Longer‑horizon video generation strategies (hierarchical generation, chunked/overlapped sampling, latent caching, frame interpolation, consistency models, or hybrid autoregressive + diffusion pipelines).

In Addition, You Have

• Experience developing and deploying generative AI systems (video synthesis/editing strongly preferred; multimodal systems also valuable).

• Strong working knowledge of multimodal representation learning (video‑text, video‑image, VLM‑style conditioning, retrieval‑augmented conditioning).

• A solid publication record…
Position Requirements
10+ Years work experience