Research Engineer Intern, Speech Generation
Listed on 2026-06-04
Software Development
AI Engineer, Software Engineer
Genies is an avatar technology company powering the next era of interactive digital identity through AI companions. With the Avatar Framework and intuitive creation tools, Genies enables developers, talent, and creators to generate and deploy game-ready AI companions. The company’s technology stack supports full customization, AI-generated fashion and props, and seamless integration of user-generated content (UGC). Backed by investors including Bob Iger, Silver Lake, BOND, and NEA, Genies’ mission is to become the visual and interactive layer for the LLM‑powered internet.
About the Role: Genies is looking for a passionate Research Engineer Intern to join our core AI team at our Bay Area office in San Francisco, CA. You will solve fundamental technical challenges in creating socially‑aware, autonomous, and creative AI characters and companions. Working closely with a dedicated team of engineers and researchers, you will bridge the gap between cutting‑edge research and real‑world product impact.
You'll Be Doing
- Research, design, and implement state‑of‑the‑art generative AI models for speech generation, including text‑to‑speech (TTS) and voice cloning.
- Investigate and apply novel architectures such as Flow Matching or Autoregressive Transformers to improve audio fidelity and voice cloning capabilities.
- Develop techniques for disentangled representation learning to separate speaker identity from prosody and emotion, enabling highly controllable and expressive character voices.
- Optimize deep learning models for real‑time inference, focusing on latency reduction and throughput improvement.
Qualifications
- Currently pursuing a Master’s or PhD degree in Computer Science, Machine Learning, Electrical Engineering, or a related quantitative field.
- Demonstrated experience with deep learning frameworks (e.g., PyTorch) and Python programming.
- Solid understanding of digital signal processing (DSP) fundamentals and audio synthesis concepts.
- Experience in at least one relevant area: speech synthesis, voice cloning, or generative modeling.
- First‑author publications in top‑tier venues such as ICASSP, Interspeech, NeurIPS, ICML, or ICLR.
- Familiarity with model optimization techniques (quantization, distillation) and inference acceleration.
- Knowledge or experience integrating machine learning models into production environments.
- Previous internship or research experience in audio AI or related domains.
The full‑time position offers a starting salary range of $45k to $55k, plus potential equity compensation and a comprehensive health, wellness, and benefits package.
- Comprehensive health insurance for you and your family (Anthem + Kaiser options available)
- Dental and vision insurance
- Competitive salaries and 401(k) program
- Flexible paid time off, sick time, paid company holidays, paid parental leave, bereavement leave, and jury duty leave
- Health and wellness support through monthly wellness reimbursement
- Work in a brand‑new, bright, open‑environment office space with a slide
Genies is an equal opportunity employer committed to promoting an inclusive work environment free of discrimination and harassment. We value diversity and inclusion, and we aim to provide a sense of belonging for everyone.