
Senior Kubernetes Engineer - Scientific & Agentic Workflow Platforms

Job in Menlo Park, San Mateo County, California, 94029, USA
Listing for: SLAC National Accelerator Laboratory
Full Time position
Listed on 2026-06-03
Job specializations:
  • Software Development
    AI Engineer, Data Scientist
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below

About The Role

Do you want your Kubernetes clusters to do more than serve web traffic? At SLAC, our infrastructure powers the discovery of new materials, the mapping of the universe, and the understanding of fundamental physics.

The Application and User Services (AUS) group within the Scientific Computing Services Division manages the platforms that underpin science. We build and operate the systems that let researchers focus on discovery rather than infrastructure. We are now seeking a Senior Kubernetes Engineer to help design and implement a scalable, next‑generation platform purpose‑built for scientific and agentic workflows.

This role is not just about managing pods and nodes—it is about building the computational engines that allow scientists to peer into atomic structure, catalog billions of galaxies, and increasingly, to deploy intelligent autonomous agents that drive the next generation of experimental science. You will stand at the intersection of cloud‑native engineering and Nobel‑prize caliber research, collaborating within SLAC and across the broader Department of Energy (DOE) complex, Stanford University, and partner institutions worldwide.

Scientific experiments like the Vera C. Rubin Observatory and LCLS generate data at rates that challenge the limits of modern infrastructure. AI‑driven agentic workflows (pipelines where autonomous agents orchestrate complex, multi‑step scientific analyses) are rapidly becoming a core part of how experiments are designed, run, and interpreted. You will help us build and maintain the platform that makes all of this possible.

Key Responsibilities

Platform Architecture & Engineering
  • Design, build, and operate highly available Kubernetes‑based platforms optimized for scientific and agentic workloads
  • Architect scalable solutions for high‑throughput data pipelines, real‑time streaming, and batch scientific computing
  • Design and implement platform primitives for agentic workflow orchestration—enabling autonomous, multi‑step AI‑driven pipelines that support experimental science
  • Develop cloud‑native architectures supporting on‑premises, hybrid cloud, and multi‑cluster deployments
  • Build and maintain Infrastructure‑as‑Code using tools such as Helm, Kustomize, and GitOps workflows
  • Evaluate and introduce new technologies and patterns that advance the platform's capabilities for the scientific community
Agentic & AI Workflow Enablement
  • Lead platform design for agentic scientific workflows—systems where AI agents autonomously orchestrate data acquisition, analysis, and experimental feedback loops
  • Collaborate with researchers and data scientists to define platform requirements for running large language model‑driven and reinforcement learning agents at scale
  • Implement infrastructure patterns for agent orchestration frameworks (e.g., multi‑agent pipelines, tool‑use APIs, memory and state management) within Kubernetes
  • Ensure the platform supports the latency, throughput, and accelerator requirements of agentic workloads
  • Build guardrails, observability, and governance tooling suited to autonomous scientific agents operating on sensitive experimental data
Scientific Project Support
  • Partner with scientists and researchers—at SLAC and across DOE labs and universities—to design and implement solutions for major scientific programs, including:
    • Vera C. Rubin Observatory / LSST:
      Petabyte‑scale nightly sky surveys requiring real‑time alert pipelines and long‑running batch analysis for dark matter and dark energy research
    • LCLS (Linac Coherent Light Source):
      Real‑time analysis infrastructure for the world's brightest X‑ray laser, capturing femtosecond‑scale dynamics of matter
    • Cryo‑EM:
      High‑throughput 3D reconstruction pipelines for structural biology at near‑atomic resolution
    • Accelerator Operations:
      Monitoring, control, and data acquisition infrastructure for particle accelerators
    • American Science Cloud:
      National‑scale scientific data infrastructure to democratize access to computing resources across National Laboratories
    • Emerging Initiatives:
      Co‑design of infrastructure for next‑generation scientific computing programs not yet fully defined
  • Support the full project lifecycle—from initial technical…
Position Requirements
10+ years of work experience