Production Support Engineering LMTS
Listed on 2026-06-02
Software Development
AI Engineer, Cloud Engineer - Software, DevOps, Software Engineer
Description
Opportunity & Product
Join an agile team with deep startup roots. We operate as a high‑velocity ‘startup‑within‑Salesforce,’ following our recent acquisition. You’ll be managed by the same founders and engineers who built the original company, offering the autonomy of a small team backed by the global scale and trust of Salesforce.
We have successfully moved past the "0 to 1" phase. We have a product that works, customers who love it, and the backing of Salesforce. Now, we are entering the "1 to 100" phase: scaling our architecture to handle global demand, hardening our systems for enterprise‑grade resilience, and integrating deeply with the Agentforce ecosystem. This is your chance to help lead that transition.
What You’ll Do
As a Production Support Engineer (LMTS), you will be a senior technical lead within our embedded reliability team. You aren’t building the foundation alone; you’ll work alongside a group of engineers and product owners to ensure the Agentforce for Supply Chain platform is the most reliable AI‑powered engine in the industry.
This is a role for an engineer who loves the "scaling" problem. You will focus on production excellence, performance tuning, and infrastructure automation. Because you are embedded in the product organization, you’ll have a seat at the table during design reviews, ensuring that as we add new agentic capabilities, they are built to scale from day one.
Responsibilities
- Scaling & Reliability: Own the reliability roadmap for major product areas, working to transition our systems from startup‑speed architectures to highly available, global‑scale enterprise solutions.
- Collaborative Leadership: Partner with PMTS‑level engineers to refine our infrastructure strategy, contributing senior‑level perspectives on system design, capacity planning, and bottleneck identification.
- Infrastructure as Code: Maintain and evolve our automated environments, focusing on making our "infrastructure‑as‑plugins" model more robust and developer‑friendly.
- AI Operations (AIOps): Support the scaling of our AI/ML infrastructure, ensuring our models have the GPU resources and data pipelines required to deliver real‑time supply chain insights.
- Production Excellence: Lead the "1 to 100" hardening of our observability stack. You won’t just respond to incidents; you’ll build the tooling that prevents them and the telemetry that explains them.
- Performance Engineering: Deep‑dive into SQL optimization, API latency, and cross‑service communication to ensure our data‑intensive supply chain platform remains performant under heavy load.
- AI‑First Workflow: Lean into the future of engineering by using AI tools (Claude Code, etc.) to automate routine operational tasks and accelerate infrastructure delivery.
- Contribute to building and maintaining the shared system context, an explicit repository of system designs, constraints, and standards that enables AI to operate accurately and reliably.
- Critically evaluate code (human or AI‑generated) for correctness, quality, security, and performance.
- 5+ years of experience in SRE, Production Engineering, or Backend Engineering with a heavy focus on operations and scale.
- Proven Scaling Experience: You have previously helped take a product through a high‑growth phase (the "1 to 100" journey), dealing with the technical debt and architectural shifts that come with it.
- Technical Breadth: Strong proficiency in Kubernetes, Terraform/OpenTofu, and AWS/GCP/Azure.
- Coding Mastery: Ability to write and review production‑level code in Golang, TypeScript, or Python. You view automation as a software engineering problem.
- Systems Expert: Deep understanding of distributed systems, including how to debug complex interactions between microservices, databases, and AI agents.
- Low‑Ego Collaboration: Experience working within a senior team of Principal engineers, capable of both leading specific initiatives and supporting the broader group’s technical vision.
- A demonstrated, genuine AI‑first approach to engineering: using AI to move faster, build fluency across the stack, and contribute well beyond your core specialty.
- Experience using AI tools (e.g., Claude Code, Git…