Operations Engineer, DevOps/MLOps
Listed on 2026-03-03
IT/Tech
Systems Engineer, Cloud Computing, Data Engineer, SRE/Site Reliability
JOB LEVEL
P50
EMPLOYEE ROLE
Individual Contributor
We are seeking an experienced Staff Software Development Engineer with deep expertise in cloud infrastructure and a passion for building scalable, production‑grade ML systems. As part of the Applied Research and Technology Services organization, you will play a meaningful role in crafting the operational backbone for high‑performance, reliable, and globally scaled machine learning services. In this role, you’ll work closely with multi‑functional collaborators, including Adobe Research, Adobe AI Platforms, and product engineering teams.
Together, you will architect solutions that speed up innovation and improve service resilience. You will also provide technical leadership and define, document, and enforce guidelines adopted across teams. You will own technical direction for core service infrastructure and MLOps, influence architectural decisions across multiple teams, and raise the operational maturity of the organization through standards, reusable platforms, and mentorship.
You will evaluate and introduce new infrastructure, optimization, and agentic technologies with clear value and adoption plans. This position is ideal for someone who thrives at the intersection of DevOps, MLOps, systems engineering, and automation.
- Build and automate cloud infrastructure provisioning, scaling, and deployments using industry‑standard tools and infrastructure‑as‑code practices.
- Architect and implement end‑to‑end MLOps pipelines for packaging, deploying, and monitoring large‑scale ML services.
- Build and integrate telemetry agents to capture operational, performance, and inference metrics across distributed ML services.
- Build backend dashboards and observability workflows that surface quality, performance, traffic, and reliability insights for ML services.
- Lead the development of Agentic Ops solutions to optimize large‑scale ML production workflows, reduce MTTR, and increase service engineering productivity.
- Develop and maintain robust CI/CD pipelines (e.g., GitLab CI, GitHub Actions, Jenkins) enabling automated model conversion, optimization (PTQ/QAT), and artifact packaging.
- Drive standards in reliability, cost optimization, and operational readiness across service deployments.
- 8+ years of experience in DevOps, SRE, or cloud infrastructure engineering roles.
- Demonstrated experience designing and managing MLOps life cycles, including model deployment, inference optimization, and production monitoring.
- Strong knowledge of CI/CD methodologies and tools such as GitOps, Docker, Terraform, GitHub Actions, GitLab CI, or Jenkins.
- Hands‑on expertise with Kubernetes orchestration, including frameworks such as Kubeflow, Argo Workflows, or similar systems.
- Strong programming skills in Python, with experience building automation tooling for ML or DevOps workflows.
- Proficiency with observability and monitoring platforms (e.g., Prometheus, Grafana, Splunk, New Relic) for building reliable production systems.
- Experience optimizing distributed architectures for cost efficiency, reliability, and performance.
- Familiarity with deep learning frameworks (e.g., PyTorch, TensorFlow) and model optimization tools such as ONNX, TensorRT, TFLite, AOT, etc., is a strong plus.
Our compensation reflects the cost of labor across several U.S. geographic markets, and we pay differently based on those defined markets. The U.S. pay range for this position is $159,200 to $301,600 annually. Pay within this range varies by work location and may also depend on job‑related knowledge, skills, and experience. Your recruiter can share more about the specific salary range for the job location during the hiring process.
State‑Specific Notices
- California:
Fair Chance Ordinances
Adobe will consider qualified applicants with arrest or conviction records for employment in accordance with state and local laws and “fair chance” ordinances.
- Colorado:
Application Window Notice
Feb 23 2026 12:00 AM – If this role is open to hiring in Colorado (as listed on the job posting), the application window will remain open until at least the date and time stated above in Pacific…