Site Reliability Engineer - Dev Operations
Listed on 2026-02-14
IT/Tech
Cloud Computing, Systems Engineer
Site Reliability Engineer - Dev Operations - Hiring on W2 Hourly! Pleasanton, CA - Remote, 12+ Month Contract
The Site Reliability Engineer (SRE) will be a lead on the DevOps team and is responsible for system administration areas including monitoring, installation, configuration, maintenance, operations, and architecture of AWS cloud and on-premises environments. The candidate will work within a team to implement and maintain all production and pre-production environments through tooling and automation. We are looking for a candidate with exceptional site reliability and DevOps skills and extensive knowledge and experience in implementing solutions and tools to maintain and grow all application environments.
Most importantly, the right individual will possess a positive, "can-do" attitude and a passion for delivering technical solutions in a fast-paced environment. In addition, the individual will be dedicated and independent, and will collaborate at a high level to ensure the stability and reliability of infrastructure and applications running in AWS cloud and on-premises environments. Advanced experience working in AWS environments is expected, as the role leads the implementation of improvements and advancements.
Must Haves
- Experience setting up alerts, alarms, and notifications in the AWS cloud (CloudWatch, Dynatrace)
- Experience with AWS solutions using services including Kafka, ECS, and EKS
- Experience with Infrastructure as Code (IaC) using CDK or Terraform
Areas to Focus On
- AWS background
- Automation
- Monitoring
- CI/CD
- Site reliability background
- System administration skills on Linux and other operating systems
- 24/7 environment support and troubleshooting
- 6+ years of overall IT experience
- 4+ years of AWS cloud management experience with the skill set below
- AWS Certified DevOps Engineer and/or Solutions Architect certification
- Experience provisioning, operating, and managing AWS environments
- Experience setting up and maintaining multi-AZ infrastructure in AWS, including high availability (HA) and disaster recovery (DR)
- Experience with code repositories: Azure DevOps Server, Git, GitLab, SVN
- Experience with continuous integration tools: Jenkins, Azure Pipelines
- Excellent knowledge of Linux systems
- Experience with system automation and configuration management tools including Ansible
- Experience with Python scripting
- Strong background in networking, load balancing, and firewalls
- High-level understanding of standard networking protocols and components such as HTTP, DNS, TCP/IP, ICMP, the OSI model, subnetting, and load balancing
- Thorough understanding of and experience with managing web applications in a highly available environment
- Experience in software development is a plus
- Familiarity with deploying and configuring Java and .NET applications
- Experience with application security testing tools is a plus (Coverity, Tenable, Black Duck, etc.)
- Understanding of SQL, PL/SQL, and T-SQL commands
- Passion for AWS cloud architecture, provisioning, monitoring, and maintenance management
- Passion for improving software development processes and a desire to automate any repetitive work. Familiarity with configuration management
- Enthusiasm for working closely with developers to understand ops requirements
- Experience with large project rollouts at an enterprise level
- Detailed knowledge of Windows and Linux Operating Systems
- Good knowledge of SCM (software configuration management)
- Working knowledge of web services, web application development, Oracle Database Server, and multi-tier application systems
- Good knowledge of software configuration, source control, build engineering, scripting, and system administration is required
- Strong troubleshooting and problem-solving skills, including application- and network-level troubleshooting
- Technical writing skills
- Knowledge of and experience with installing, configuring, and troubleshooting SSL certificates
- Understanding of TCP/IP, UDP, IP routing, SSH/SFTP/SCP, DNS, FTP, and SMTP