Platform Engineer
Job in
Los Altos, Santa Clara County, California, 94024, USA
Listed on 2026-06-14
Listing for:
JazzX AI
Full Time position
Job specializations:
- IT/Tech
SRE/Site Reliability, Systems Engineer, Cloud Computing: Infrastructure & Operations, IT Infrastructure
Job Description & How to Apply Below
We are looking for a highly experienced Staff Platform Engineer to lead the architecture, automation, scalability, and operational maturity of our platform infrastructure and engineering ecosystem. This role requires deep expertise across cloud infrastructure, distributed systems, networking, DevOps, platform engineering, security, and intelligent automation.
As a senior technical leader, you will drive platform strategy, define engineering standards, mentor engineers, and build highly scalable and resilient systems that improve developer productivity, operational efficiency, security, and customer experience.
Platform Architecture, Engineering & Reliability
- Architect and build scalable, secure, and highly available cloud-native platforms on Microsoft Azure
- Design and manage infrastructure automation using Terraform, GitOps, and Infrastructure as Code (IaC)
- Build and improve CI/CD systems, deployment orchestration, environment provisioning, and self‑service developer platforms
- Design secure networking and authentication systems using Cloudflare, API gateways, private networking, and Zero Trust principles
- Drive platform reliability, scalability, observability, security, and operational excellence across production environments
- Lead incident management, root cause analysis, operational governance, deployment reliability, and resiliency improvements
- Build AI/LLM and agentic operational workflows for troubleshooting, automation, operational insights, and intelligent platform management
- Improve operational efficiency through automation, intelligent tooling, anomaly detection, and cloud cost optimization
- Strong hands‑on coding experience in Python, Go, or other modern programming languages, with strong system design and problem‑solving fundamentals
- Deep understanding of cloud-native architectures, distributed systems, networking, CI/CD, infrastructure automation, and platform engineering principles
- Experience building internal developer platforms, self‑service systems, operational tooling, and AI-assisted engineering workflows
- Strong understanding of SDLC, DevOps practices, Agile methodologies, release governance, and production operational models
- Strong troubleshooting and debugging skills across infrastructure, cloud, deployment, networking, and application layers
- Partner with engineering, architecture, security, and product teams to define platform standards and long-term technical strategy
- Improve developer experience, platform adoption, deployment reliability, and operational maturity across teams
- Lead architecture discussions, operational reviews, and production readiness initiatives
- Mentor engineers and drive engineering best practices across platform and infrastructure domains
- Drive engineering culture around automation, scalability, reliability, operational excellence, and platform standardization
- Strong communication, collaboration, stakeholder management, and technical leadership skills
- 10+ years of experience in Platform Engineering, Infrastructure Engineering, DevOps, SRE, or Cloud Engineering
- Extensive experience managing large-scale production environments on Microsoft Azure
- Experience with Kubernetes, container platforms, GitOps, Terraform, and cloud-native systems
- Experience building enterprise-grade CI/CD and deployment automation platforms
- Experience driving operational excellence, observability, reliability engineering, and infrastructure automation initiatives
- Experience leading cross-functional technical initiatives and mentoring engineers
- Experience with AI/LLM workflows, coding agents, and intelligent automation systems is a strong plus
- Terraform, GitHub Actions / Azure DevOps, CI/CD pipeline design, Infrastructure as Code (IaC), GitOps, Python / Go or any modern programming language
- Cloudflare Gateway / WAF, API security, authentication & authorization, SSO integration (Okta, Entra), network security and Zero Trust concepts
- Monitoring and alerting, logging and tracing, incident management, root cause analysis, reliability engineering
- LLM-based operational workflows, agentic automation systems, AI-assisted incident analysis and remediation, intelligent anomaly detection and operational insights
- Kubernetes ecosystem tools, service mesh, vector databases, MCP/agentic orchestration frameworks, FinOps and cloud cost optimization tools