Senior AIML Optimization Engineer
Listed on 2026-01-06
IT/Tech
AI Engineer, Cloud Computing
The Onyx Research Data Tech organization is GSK’s Research data ecosystem, which has the capability to bring together, analyze, and power the exploration of data. We partner with scientists across GSK to define and understand their challenges and develop tailored solutions that meet their needs. The goal is to ensure scientists have the right data and insights when they need them, so that they have a better starting point for medical discovery and can accelerate it.
Ultimately, this helps us get ahead of disease in more predictive and powerful ways.
Onyx is a full‑stack shop consisting of product and portfolio leadership, data engineering, infrastructure and DevOps, data / metadata / knowledge platforms, and AI/ML and analysis platforms. All are geared toward creating a next‑generation, metadata‑ and automation‑driven data experience for GSK’s scientists, engineers, and decision‑makers, providing best‑in‑class AI/ML environments, and aggressively engineering our data at scale to unlock its value in real time.
We’re looking for a highly skilled Senior AIML Optimization Engineer to help us make this vision a reality.
Key Responsibilities:
- Serve as a key engineer for the optimization team and contribute technical expertise to teams in closely aligned technical areas such as DevOps, Cloud, and Infrastructure.
- Lead design of major optimization software components of the Compute and AIML Platforms, contribute to development of production code, and participate in design and PR reviews.
- Be accountable for delivery of scalable solutions to the Compute and AIML Platforms that support the entire application lifecycle, with particular focus on performance at scale.
- Partner with AIML and Compute platform teams and scientific users to help optimize and scale scientific workflows using deep knowledge of software and underlying infrastructure (networking, storage, GPU architectures).
- Participate in or lead scrum teams and contribute technical expertise to closely aligned technical areas.
- Design innovative strategies and ways of working to create a better environment for end users, and construct a coordinated, stepwise plan to bring others along during change.
- Act as a standard bearer for proper ways of working and engineering discipline, including CI/CD best practices, and proactively spearhead improvement within the engineering area.
Qualifications:
- Bachelor’s, Master’s or PhD degree in Computer Science, Software Engineering, or related discipline.
- 6+ years of experience as a Computer Engineer, or 4+ years with a Master’s, or 2+ years with a PhD, using specialized knowledge in cloud computing, scalable parallel computing paradigms, software engineering, and CI/CD.
- 2+ years of experience in AIML engineering, including large‑scale model training and production deployment.
- Deep experience using at least one interpreted and one compiled industry programming language (e.g., Python, C/C++, Scala, Java) with toolchains for documentation, testing, and operations/observability.
- Deep experience with application performance tuning and optimization in parallel and distributed computing paradigms and communication libraries such as MPI, OpenMP, and Gloo, along with a deep understanding of underlying systems (hardware, networks, storage).
- Deep expertise in modern software development tools and ways of working (e.g., git, GitHub, DevOps tools, metrics/monitoring).
- Deep cloud expertise (e.g., AWS, Google Cloud, Azure), including infrastructure‑as‑code tools (Terraform, Ansible, Packer) and scalable cloud compute technologies (e.g., Google Batch, Vertex AI).
- Expert understanding of AIML training optimization, including distributed multi‑node training best practices and practical experience accelerating training jobs.
- Understanding of ML model deployment strategies, including agent systems and scalable LLM inference systems deployed in multi‑GPU, multi‑node environments.
- Experience with CI/CD implementations using git and common CI/CD stacks (e.g., Azure DevOps, Cloud Build, Jenkins, CircleCI, GitLab).
- Experience with Docker, Kubernetes, and the larger CNCF ecosystem, including application deployment tools such as Helm.
- Experience…