Platform - Site Reliability Engineer II (Networking)
Listed on 2026-02-12
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability, Network Engineer
Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data. The Elastic Search AI Platform is used by more than 50% of Fortune 500 companies and combines the precision of search with the intelligence of AI to accelerate the results that matter.
What Is the Role
As part of the Platform Engineering department’s Traffic team, you will craft, build, and improve the multi‑cloud platform for Elastic Cloud Hosted and Serverless. You’ll contribute to Kubernetes, Go/Scala, and custom orchestration architectures, working on coding, design, resilience, security, bug fixes, and features.
What You Will Be Doing
- Lead technical initiatives that automate network engineering to guarantee reliability of global Elastic infrastructure.
- Develop and maintain software, tooling, and automation to grow platform infrastructure for increasing scale.
- Collaborate inclusively, focusing on operational excellence and uplifting teammates.
- Respond to customer impact during major incidents and prevent its recurrence through prioritized problem management. Participate in a follow‑the‑sun on‑call rotation.
What You Bring
- Experience driving platform reliability with a “progress, not perfection” mindset and a customer‑first approach to operational problems.
- A software engineering background, including collaborating with engineers to design, implement, and deliver solutions; experience with public cloud and managed Kubernetes services is advantageous.
- Passion for inclusive communication and building strong partner and team relationships; experience working remotely or in distributed teams.
- Operated a SaaS product in public cloud using Infrastructure‑as‑Code tools such as Crossplane or Terraform.
- Built or operated Kubernetes‑at‑scale across multiple cloud providers with supporting automation.
- Written non‑trivial programs in Go or other languages.
- Worked with Docker‑based containerized services.
- Led and improved alerting, incident management, and metrics systems (Elastic Stack, Graphite, Prometheus, Influx).
- Professional Linux system administration on distributed systems at scale.
- Diagnosed or designed solutions with the Elastic Stack.
- Thrived in self‑organizing, globally distributed teams.
- Coached and mentored teammates to bring out the best in each other.
As a distributed company, diversity drives our identity. We strive for parity of benefits across regions, with flexible locations and schedules. Our benefits include competitive pay, health coverage for you and your family, generous vacation, matching for charitable donations, paid volunteer hours, and parental leave of at least 16 weeks.
EEO and Disability Statement
Elastic is an equal‑opportunity employer committed to creating an inclusive culture that celebrates different perspectives. Qualified applicants will receive consideration without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender identity, national origin, age, marital status, protected veteran status, disability status, or any basis protected by federal, state, or local law. We welcome individuals with disabilities and will provide accommodations during the application process; please email candid for support.
Please see for our privacy statement.