Software Engineer - AI/ML Systems and Reliability
Listed on 2026-05-20
Software Development
Cloud Engineer - Software, AI Engineer
Adobe is looking for a Staff Software Engineer – AI/ML Systems, MLOps & Reliability to build and scale the platform powering Adobe Experience Platform's Personalization ML solutions and Generative AI capabilities.
This role sits at the intersection of software engineering, MLOps, infrastructure, and reliability engineering. You will help design and operate the foundational platform that enables scalable model training, reliable inference, automated ML workflows, and production‑grade AI systems for enterprise‑scale personalization use cases.
Partnering closely with engineering, product, and data science teams, you will build systems that support intelligent audience creation, journey optimization, and personalization. You will join a collaborative and highly technical team of engineers and scientists with deep expertise in distributed systems and machine learning.
The ideal candidate enjoys building platform capabilities for ML systems and operating highly reliable cloud‑native infrastructure. This is a hands‑on role where you will contribute across MLOps platform development, distributed systems engineering, DevOps automation, and production reliability.
What You'll Do
AI/ML Platform & MLOps
- Architect and build infrastructure for AI/ML systems, including Personalization and Generative AI platforms.
- Design and build MLOps capabilities such as model deployment pipelines, feature stores, model registries, and inference infrastructure.
- Partner with ML engineers and data scientists to productionize ML models and workflows.
- Build scalable platform services and APIs supporting multiple teams and products.
- Improve reliability, scalability, observability, and operational efficiency of distributed AI systems.
- Build monitoring, alerting, logging, and tracing solutions for production services.
- Develop CI/CD pipelines, deployment automation, and infrastructure‑as‑code tooling.
- Troubleshoot production issues and drive operational excellence for cloud‑native services.
- Design highly available systems that scale horizontally.
- Lead technical design and architecture discussions across teams.
- Participate in design, development, testing, code reviews, deployment, and production support.
- Evaluate and adopt emerging technologies in AI/ML infrastructure and distributed systems.
Required Qualifications
- Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
- 8+ years of software engineering experience building distributed systems.
- Curiosity and bias to action.
- Strong programming skills in Python or Java.
- Experience with microservices, REST APIs, and cloud‑native architectures.
- Experience with AWS or Azure, Kubernetes, and Docker.
- Experience with CI/CD, infrastructure automation, and production operations.
- Strong understanding of reliability, scalability, and observability for distributed systems.
- Strong troubleshooting, communication, and collaboration skills.
- Experience with MLOps platforms or ML infrastructure.
- Hands‑on experience with Generative AI applications.
- Familiarity with Ray, Kafka, Spark, Airflow, or similar distributed systems technologies.
- Experience with relational and NoSQL databases such as MySQL, PostgreSQL, Redis, Elasticsearch, or Snowflake.
- Experience supporting high‑throughput, low‑latency production systems.
Adobe is proud to be an Equal Employment Opportunity employer. We do not discriminate based on gender, race or color, ethnicity or national origin, age, disability, religion, sexual orientation, gender identity or expression, veteran status, or any other protected characteristic.