Principal DevOps Engineer - ML/AI Algorithms

Job in Pleasanton, Alameda County, California, 94566, USA
Listing for: F. Hoffmann-La Roche Gruppe
Full Time position
Listed on 2025-12-09
Job specializations:
  • IT/Tech
    Cloud Computing
Salary/Wage Range or Industry Benchmark: 162,600 - 302,000 USD yearly
Job Description & How to Apply Below

At Roche you can show up as yourself, embraced for the unique qualities you bring. Our culture encourages personal expression, open dialogue, and genuine connections, where you are valued, accepted and respected for who you are, allowing you to thrive both personally and professionally. This is how we aim to prevent, stop and cure diseases and ensure everyone has access to healthcare today and for generations to come.

Join Roche, where every voice matters.

The Position

Principal DevOps Engineer - ML/AI Algorithms

Developing software is great, but developing software with a purpose is even better! As a Principal DevOps Engineer - ML/AI Algorithms, you will work on products that help people with the most precious thing they have: their health. You will be part of the RIS Research & Development team, contributing to digital health products that span imaging, ML/AI, and computational science.

The Opportunity

As Principal DevOps Engineer, you will collaborate with key stakeholders to develop the DevOps build, release, and deploy toolchain, paving the way for seamless and efficient software delivery.

Location

This role can be based in Santa Clara (primary location) or in secondary locations (Mississauga, Canada or Basel, Switzerland).

Key Responsibilities
  • Lead the initiative to set up, manage, and maintain parity across development, staging, and production application environments in cloud infrastructure, ensuring a robust and consistent deployment pipeline.

  • Champion the development of advanced monitoring infrastructure, giving the team real-time insights and ensuring the highest levels of system reliability and performance.

  • Provide dedicated on-call support for production operations, ensuring the uninterrupted delivery of critical services and swift resolution of any operational issues.

  • Interface with software developers, product managers, test engineers, and administrators on projects to design and develop the DevOps build, release, and deploy toolchain while providing on-call support.

  • Identify, troubleshoot and resolve issues quickly and effectively, sometimes under pressure.

  • Participate actively in planning, high-availability engineering, performance tuning, and automation/tools development.

  • Manage multiple releases with a focus on system reliability, scalability, and efficiency.

  • Implement and manage the full lifecycle of machine learning models, including versioning, deployment strategies (e.g., canary, A/B testing), monitoring for drift and performance, and decommissioning.

  • Bring leadership to improve DevOps technology and processes, and mentor other DevOps engineers on the team.

Who You Are
  • Bachelor's degree in Computer Science, Engineering, or a related field with 8+ years of experience in a DevOps role, or an equivalent combination of education and experience to perform at this level.

  • 8+ years of experience with container technology, including Kubernetes, AWS EKS, Helm charts, Splunk, and Docker, along with provisioning infrastructure through IaC using Terraform and cloud automation principles.

  • Proficiency in Unix/Linux administration, shell scripting, and internals, with a preference for Ubuntu.

  • Deep working experience and extensive knowledge in building and deploying infrastructure using IaC frameworks such as Terraform and AWS CloudFormation/SAM.

  • Experience building and automating scalable data pipelines for ingestion, transformation, distributed computation, and versioning of large-scale image datasets.

  • Familiarity with DevOps practices and proficiency in log analysis and monitoring tools, both essential for effective troubleshooting and system optimization.

  • Proficiency in Python for automating production systems, along with Git, GitLab, GitHub Actions, and GitHub CI/CD; familiarity with common ML libraries such as TensorFlow, PyTorch, and scikit-learn to understand the engineering needs of the ML models you will be deploying.

  • Strong working knowledge of AWS Cloud infrastructure, including EC2, S3, API Gateway, Kubernetes, RDS, VPC peering, Route 53, IAM, Batch, Lambda, AWS Config, and Auto Scaling.

Pr…