Sr. Software Engineer/DevOps
Listed on 2026-02-24
IT/Tech
Systems Engineer, Cloud Computing, IT Support, SRE/Site Reliability
Job Title: Sr. Software Engineer/DevOps
Location: San Francisco, CA
About The Role
You’ll work closely with our Infrastructure and Platform teams to manage, improve, and scale the systems that power our products. Your focus will be on ensuring our infrastructure is reliable, observable, and easy to operate, with an emphasis on automation, operational excellence, and cross-functional collaboration.
You’ll help build and maintain the foundational infrastructure that supports our SaaS applications, including Kubernetes, Terraform-managed cloud resources, and GitHub-based CI/CD pipelines. While incident response is part of the role, the primary focus is on proactive improvements: reducing operational toil, improving visibility into system behavior, and enabling product teams to move fast with confidence.
What You'll Do
- Infrastructure Management: Build, manage, and optimize infrastructure using Terraform, GitHub CI/CD, and Kubernetes.
- Monitoring & Observability: Create visualizations and alerts that provide actionable insights using tools like Grafana, Prometheus/Mimir, OpenSearch, and Sentry.
- Automation & Reliability: Identify manual or error-prone processes and replace them with automated, repeatable systems.
- Production Troubleshooting: Diagnose and resolve production issues across application and infrastructure layers.
- Documentation: Capture knowledge in runbooks, setup guides, and architecture diagrams to support operational maturity.
- Collaboration: Partner with engineers across teams to drive adoption of DevOps and infrastructure best practices.
- Scalability Planning: Help scale infrastructure and monitoring systems to meet growing demands.
- Incident Participation: Participate in an on-call rotation and support incident response processes as needed.
Qualifications
- Observability: Experience with metrics, logs, and traces using tools such as Grafana, Prometheus/Mimir, OpenSearch, Sentry, or similar.
- Infrastructure as Code: Proficient with Terraform, Kubernetes, and containerization tools.
- Programming Skills: 5+ years of experience with Python.
- Linux Systems: Comfortable working with Linux-based environments and writing shell scripts.
- Communication: Strong collaboration skills with a focus on asynchronous, written communication.
- Documentation: Commitment to clear, comprehensive documentation and process standardization.
- Initiative: Self-starter mindset with a proactive approach to solving operational challenges.
- Version Control: Skilled in Git/GitHub-based workflows.