Sr Site Reliability Engineer
Listed on 2026-02-18
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability
Location: Santa Clara, California, US
Work Model: Hybrid (onsite three days per week)
Note: Relocation not offered. Ericsson Inc. does not sponsor U.S. work authorizations for this role (including H1B, O1, TN) and does not hire F1 candidates working on EAD.
Join a high-impact team that combines the agility of a startup with the resources of a global technology leader. Our BCSS CNE SPS Shared Development Cloud group builds the next generation of cloud infrastructure using Kubernetes, AI, and cutting‑edge cloud‑native technologies.
Senior Site Reliability Engineer (System Engineer V)
You’ll work on meaningful challenges: designing resilient systems that power 5G/6G networks, leveraging LLMs to automate operations, and building infrastructure that thousands of developers depend on. If you enjoy solving complex technical problems and working with emerging technologies, this role offers significant opportunities for impact and growth.
BCSS: Business Area Cloud Software and Services
CNE: Core Networks Engineering
SPS: Software Production System
As a Senior Site Reliability Engineer, you’ll design, develop, and operate Ericsson Web Services (EWS)—our internal cloud‑native platform built on Kubernetes. You’ll work hands‑on across the full infrastructure stack, from bare metal to container orchestration to AI‑powered automation.
Key Responsibilities
- Design, architect, and implement cloud infrastructure for EWS, optimizing Kubernetes platforms for cloud‑native workloads across compute, storage, and networking layers.
- Build and maintain automation tooling for infrastructure provisioning, monitoring, and operations using Infrastructure as Code practices.
- Develop and operate SRE systems leveraging Large Language Models (LLMs) for AI‑driven operations and intelligent automation.
- Implement and maintain cloud security practices including vulnerability scanning, security monitoring, compliance automation, and incident response.
- Provide SRE on‑call support covering North American time zones.
- Engage with Ericsson’s AI communities and external ecosystems (academia, open source) to stay current with cloud and AI technologies.
- Collaborate across research, product development, architecture, and service teams on AI solutions for 5G/6G and IoT systems, including predictive operations, anomaly detection, and intelligent network analysis.
Required Technical Skills:
- Cloud‑Native Infrastructure: Deep hands‑on experience operating production‑grade Kubernetes environments, including troubleshooting, performance tuning, and capacity planning.
- Linux Systems: Strong Linux administration skills including systemd, networking (bridges, VLANs, routing), storage, and performance optimization.
- Automation & IaC: Proficiency in Python, Bash, and Go; experience with Ansible, Terraform, or similar automation frameworks.
- Containerization: Solid understanding of container runtimes (containerd, Docker), image management, and container orchestration.
- Networking: Experience with Kubernetes networking (CNI, Cilium, Calico), load balancing, and service mesh concepts.
- Storage Systems: Hands‑on experience with cloud‑native storage solutions (Ceph, NFS, object storage) and Kubernetes storage concepts (CSI, StorageClass, PVC).
- Observability: Experience with monitoring, logging, and alerting tools (Prometheus, Grafana, Loki, or similar).
- CI/CD: Experience with GitLab CI, Jenkins, or similar platforms for building automated pipelines.
- AI‑Powered Development: Experience with AI‑assisted coding tools (GitHub Copilot, Cursor, or similar) and LLM‑powered automation.
- Bare Metal & Virtualization: Knowledge of bare metal provisioning and virtualization technologies (KVM, libvirt).
- Secret Management: Familiarity with secret management solutions (Vault, OpenBao).
- CNCF Ecosystem: Understanding of the CNCF landscape and ability to evaluate emerging cloud‑native technologies.
- Database Operations: Experience with database systems (MySQL, Redis, PostgreSQL).
We value proven ability to quickly learn and master new technology domains. Candidates with strong fundamentals and demonstrated adaptability are encouraged to apply even if not experienced in all…