Site Reliability Engineer
Location: Chicago, Cook County, Illinois, 60290, USA
Listed on: 2025-12-01
Listing for: Request Technology, LLC
Employment type: Full Time
Job specializations: IT/Tech; Cloud Computing, SRE/Site Reliability, Systems Engineer, IT Support
Job Description
Hybrid (3 days onsite, 2 days remote) full‑time. No visa sponsorship. Base pay: $150,000 – $155,000 per year, subject to skills and experience.
A prestigious company seeks a Site Reliability Engineer focused on observability, logging, and capacity planning. The role requires experience with Linux, Kubernetes/Docker, Terraform, Jenkins, Ansible, Harness, and Kafka.
Responsibilities
- Collaborate with development, operations and infrastructure teams to ensure availability of services and to work through implementation issues
- Develop automation for incident response and to prevent problem recurrence
- Create and enhance runbooks to respond to service outages or degradations
- Assess the production readiness of services
- Define and track operational metrics for production performance, reliability, scalability and availability
- Architect, develop and maintain shared services and tools to improve reliability and reduce toil across the organization
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Information Systems or another related field, or equivalent work experience
- Minimum of 4 years of experience in Site Reliability Engineering / DevOps
- Experience with maintaining and troubleshooting large‑scale distributed systems
- Experience managing infrastructure in public cloud environments like AWS (preferred), Azure or GCP
- Experience with AIOps and predictive analysis for anomaly detection and system capacity forecasting, using monitoring and alerting tools like Splunk, AppDynamics, Datadog, Stackdriver, Sysdig, Prometheus or Grafana
- Programming/scripting experience in languages like Java, Bash, Python or Go
- Experience with distributed messaging systems such as Kafka, RabbitMQ or ActiveMQ
- Experience with container orchestration systems such as Kubernetes, Mesos, Docker Swarm or Rancher
- Experience with CI/CD tools such as Jenkins, Travis, Harness, AppVeyor, CodeBuild or CodePipeline
- Familiarity with leveraging large language models (LLMs) to automate and optimize SRE workflows, including scripting, incident report summarization, or AI workload maintenance
Seniority level: Mid‑Senior