Senior Data Scientist - Generative AI Job San Mateo area, California, USA, IT/Tech

Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators.

At Roblox, we’re building the tools and platform that empower our community to bring any experience that they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We’re on a mission to connect a billion people with optimism and civility, and looking for amazing talent to help us get there.

A career at Roblox means you’ll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.

WHY DATA SCIENCE & ANALYTICS?

The Data Science & Analytics organization's mission is to increase our speed, frequency, and acumen in making decisions at scale by instilling a data-influenced approach to building products. We cover a wide area of the data spectrum, including analytical data engineering, product analytics, experimentation, causal inference, statistical modeling, and machine learning. Aligned and partnered with product verticals, we use this extensive tool belt to discover new opportunities and unmet use cases, influence and craft the product roadmap, prioritize and build data products, and measure impact on our community of players and developers.

WHY GENERATIVE AI?

The Foundation AI group’s mission is to enable Roblox Creators to accelerate their workflows and bring GenAI capabilities to millions of users. We envision a future where experiences on Roblox leverage generative text and speech to enable new interactions, and generative 3D and 4D capabilities to empower new creative workflows and user experience.

As a Data Scientist on the team, you will design, build, and operationalize evaluation for GenAI systems, and work with cross-functional teams to improve model performance and the AI data generation flow. Since AI evaluation is core to GenAI safety, quality, and iteration speed, we are building rigorous and scalable human and model-based evaluation systems that guide product decisions and model improvement.

You’ll combine annotation analysis, design of experiments, causal inference, product analytics, and model-based evaluation methods (such as LLM-as-a-judge / VLM-as-a-judge) to measure quality, safety, and user satisfaction—and translate these findings into model and product improvements. You’ll also help develop groundbreaking methodologies and tools that advance AI evaluation at Roblox and set industry standards. Beyond AI evaluation, we proactively explore opportunities and solutions to improve the AI model and data generation flow.

Additionally, we will build agentic workflows and AI agents for data solutions that enable teams to effectively access data, extract data insights, follow best practices, and make data-informed decisions.

If you are a self-starter who is curious, rigorous, and passionate about building innovative solutions that deliver real business value, and you thrive in a dynamic, collaborative environment, this role is for you.

You Will:

Develop and improve evaluation frameworks for GenAI features (text, image, 3D, 4D, agentic workflow), including eval experiment design, eval dataset design, label reliability analysis, results analysis, and online evaluation based on user behavior and feedback.
Establish best practices and guidelines for GenAI evaluation.
Conduct product analytics, online experiments (A/B tests) and causal analyses to quantify GenAI feature impact and identify opportunities.
Build automated evaluation systems, including researching and implementing LLM-as-judge and VLM-as-judge methods.
Research and apply state-of-the-art methodologies in GenAI evaluation.
Advance reproducible evaluation tooling to lift evaluation rigor and efficiency at the company.
Proactively explore and develop solutions to improve the AI model and data generation flow, ensuring high-quality input for training and deployment.
Design and implement agentic workflows and AI agents to enable teams to effectively access data,…