Senior ML Test Engineer - Inference & Validation
San Mateo area, California, USA; Software Development

Why Sony Interactive Entertainment?

Sony Interactive Entertainment isn't just the Best Place to Play - it's also the Best Place to Work. Sony Interactive Entertainment (SIE) is the company behind the PlayStation brand. As a subsidiary of Sony Group Corporation, we're part of a proud legacy of innovation and excellence. SIE is a dynamic technology company, delivering cutting-edge hardware and network services to more than 100 million people, and an entertainment leader, home to some of the most beloved and recognizable intellectual properties (IP) in the world.

Our role at SIE is to create and nurture the experiences under the PlayStation brand, a name synonymous with entertainment excellence and creativity.

At PlayStation, we're passionate about both partnership and technology. We care about building performant products and powerful tools, solving practical problems in delightful ways. We love building the systems that millions of people use, and we're always our own first customers and critics. We strive to be at the forefront of development technologies and celebrate our diversity, with engineering teams from all walks of life located on four continents.

We build our teams to be places where every voice is heard, all input is valued, and our members have the creative freedom to make their ideas real.

PlayStation is seeking a Sr Software Development Engineer in Test, ML to join our San Mateo team and lead quality efforts for machine learning-powered products and services. You will partner closely with ML engineers, software engineers, product managers, and data partners to build scalable test automation and validation approaches for systems serving PlayStation at global scale. This role is focused on building confidence in model reliability, production readiness, and behavioral consistency.

Unlike in traditional backend software testing, success here requires validating probabilistic systems through score distributions, cohort-level behavior, thresholds, and statistical signals, not just deterministic pass/fail assertions.

Responsibilities:

Design, develop, and maintain automated test frameworks and test scripts for ML inference services and related ML platform components using Python/Java
Build and automate the creation of realistic, representative test accounts and datasets to evaluate model behavior across different scenarios
Validate score distributions, ranking behavior, output quality, and other model signals for incoming ML models prior to release
Perform multiple types of testing: functional, integration, regression, and API testing against RESTful APIs, microservices, and databases, as well as non-functional performance and load testing to verify inference service scalability, latency requirements, and reliability under production-like conditions
Work closely with ML engineers and ops teams to ensure that all features and bug fixes come with automated test coverage, supporting continuous integration and deployment
Debug and analyze test failures and model anomalies, and report defects
Drive high standards of quality for both engineering decisions and customer-facing features with optimal test and automation strategy
Develop and implement best practices for test automation that is highly scalable and maintainable
Monitor test execution and optimize performance in test environments
Come up with innovative solutions to organizational problems

Required Qualifications:

Bachelor's degree or equivalent
5+ years of experience as an SDET, with strong experience testing backend systems.
Proficiency in Python/Java/Scala/Go for test automation.
Hands-on experience with API testing tools (e.g., Postman or similar) for distributed systems
Familiarity with CI/CD pipelines and test execution in Jenkins or similar environments such as GitHub Actions, ArgoCD, etc.
Hands-on experience building and maintaining automated test frameworks (e.g., pytest, JUnit, TestNG, etc.)
Experience in working with databases for backend validation.
Experience with cloud and container technologies (AWS, GCP, Kubernetes, Docker, etc.)
Familiarity with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, etc.) for validating service health
Strong understanding of software development lifecycle (SDLC) and agile methodologies.
Excellent problem-solving skills and excellent communication skills.

Preferred Qualifications:

Experience testing machine learning systems, such as recommendation systems, search ranking systems, or other probabilistic ML application systems.
Experience creating synthetic or seeded test data to simulate realistic customer/account behaviors.
Experience evaluating model outputs using statistical techniques, distribution analysis, or scenario-based validation, rather than deterministic assertions.
Knowledge of data pipelines, feature stores, inference systems or model-serving infrastructure (Seldon, KServe, Ray Serve, etc.) is a plus.
Experience testing online services with high RPS and low latency requirements
Experience with…