Research Scientist Graduate; Foundation Model-Speech-Interaction & Learning PhD Job San Jose area, California, USA, Engineering

Position: Research Scientist Graduate (Foundation Model-Speech-Interaction & Learning) - 2026 Start (PhD)
The Seed-Speech Team empowers interactive content and creative expression through cutting-edge speech and audio technologies. We focus on foundational model research and engineering across speech recognition, synthesis, and transformation, as well as full-duplex dialogue, real-time voice interaction, music generation, and multimodal integration. Recent innovations such as Seed-Live Interpret (low-latency speech-to-speech translation), Seed-Realtime Voice (expressive conversational voice modeling), and Seed-Music (controlled music creation) highlight our mission to push the boundaries of natural, seamless, and creative human-AI communication.

We are looking for talented individuals to join our team in 2026. As a graduate, you will get opportunities to pursue bold ideas, tackle complex challenges, and unlock limitless growth. Launch your career where inspiration is infinite. Successful candidates must be able to commit to an onboarding date by the end of 2026. Please state your availability and graduation date clearly in your resume.

Responsibilities:
· Contribute cutting-edge research to ByteDance product evolution (e.g., TikTok, Douyin, CapCut) to impact billions of users worldwide.
· Work on advanced science and technology in audio processing and generation (e.g., Dialogue Systems, Audio-Video Models, Speech Synthesis, Voice Conversion, Audio Codec Learning, Audio Language Modeling, etc.)
· Research, model, design, develop and evaluate novel machine learning models and algorithms.
· Collaborate with globally based researchers and engineering teams in developing machine learning models and algorithms.

Minimum Qualifications:
· PhD graduate with a background in computer science, machine learning, or similar fields.
· Good knowledge of theoretical and empirical research in addressing research problems.
· Solid knowledge and experience with at least one popular deep learning framework (e.g., PyTorch, TensorFlow) and familiarity with deep neural network architectures.
· Experience in both neural and non-neural, classical machine learning models and algorithms.
· Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.

Preferred Qualifications:
· Good presentation and communication skills.
· Research experience in one or more of the following fields: speech synthesis, audio generation, large language models, computer vision, generative models.
· Strong first-author publication record in top AI conferences or journals (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, etc.).
· Proficient in C/C++, Python, and shell programming languages, with a deep understanding of data structures and algorithm design.
· Internship experience in an AI research organization.

By submitting an application for this role, you accept and agree to our global applicant privacy policy, which may be accessed here:

