Sr DevOps Engineer
Listed on 2026-06-02
IT/Tech
Systems Engineer, SRE/Site Reliability
About the Role
We’re seeking a Senior DevOps Engineer with deep infrastructure expertise to design, build, and operate the foundational systems that power our products. The role focuses on reliability, scalability, automation, and operational excellence across cloud and on‑prem environments. You’ll work closely with engineering teams to evolve our infrastructure platform, reduce toil, and ensure our systems are robust, observable, and efficient.
Key Responsibilities
- Architect and maintain core infrastructure systems across compute, storage, networking, and cloud services.
- Develop automation and tooling to eliminate manual operations and improve system consistency.
- Implement and manage infrastructure‑as‑code using modern frameworks and best practices.
- Drive reliability engineering practices including SLOs, SLIs, error budgets, and incident response.
- Enhance observability through metrics, logging, tracing, and actionable alerting.
- Optimize system performance and capacity to support growth and cost efficiency.
- Lead complex troubleshooting efforts across distributed systems and production environments.
- Collaborate with engineering teams to ensure infrastructure supports evolving product needs.
- Strengthen security and compliance posture through hardened infrastructure and best practices.
- Mentor engineers and contribute to a culture of operational excellence.
Qualifications
- 8+ years of experience in SRE, DevOps, or infrastructure engineering with hands‑on ownership of production systems.
- Deep knowledge of Linux systems, networking fundamentals, and distributed systems.
- Proficiency with IaC tools such as Terraform, Ansible, and CloudFormation.
- Experience with containerization and orchestration (Docker, Kubernetes).
- Solid programming or scripting skills in Python, Go, Bash, or similar.
- Hands‑on experience with CI/CD systems and automated deployment pipelines.
- Strong observability background using Prometheus, Grafana, ELK, OpenTelemetry, or similar.
- Proven incident management experience in high‑availability environments.
Preferred Qualifications
- Experience with hybrid or multi‑cloud environments.
- Knowledge of infrastructure security including secrets management and zero‑trust principles.
- Background with large‑scale distributed systems or high‑throughput architectures.
- Open‑source contributions in SRE, infrastructure, or cloud‑native ecosystems.
What Success Looks Like
- Highly reliable, scalable infrastructure that supports rapid product growth.
- Reduced operational toil through automation and self‑service capabilities.
- Clear, actionable observability enabling fast detection and resolution of issues.
- Efficient, cost‑optimized systems across cloud and on‑prem environments.
- A strong reliability culture across engineering teams.
It is the policy of F5 to provide equal employment opportunities to all employees and employment applicants without regard to unlawful considerations of race, religion, color, national origin, sex, sexual orientation, gender identity or expression, age, sensory, physical, or mental disability, marital status, veteran or military status, genetic information, or any other classification protected by applicable local, state, or federal laws.
This policy applies to all aspects of employment, including, but not limited to, hiring, job assignment, compensation, promotion, benefits, training, discipline, and termination. F5 offers a variety of reasonable accommodations for candidates.