Lead Site Reliability Engineer (SRE) – AWS/Linux/Windows
Listed on 2026-06-27
Location: Hartford, CT (Hybrid – 3 days in office)
Experience: 10+ years
Please note: this role cannot offer visa transfer or sponsorship, now or in the future.
Job Summary
We are seeking a highly capable Compute RE Lead to drive and manage compute operations across Linux/Unix, Windows, and AWS environments. This role will lead the compute team, provide technical guidance, and leverage Reliability Engineering (RE) principles to improve efficiency and reduce operational toil.
The ideal candidate will have strong expertise in Linux (preferred) along with working knowledge of Windows, enabling them to effectively guide both teams and customers.
- Lead the compute function across Linux/Unix, Windows, and AWS environments
- Provide technical guidance and mentorship to the compute team
- Identify opportunities to reduce toil by applying RE principles and automation
- Act as the primary point of contact for compute-related discussions with stakeholders
- Perform day-to-day administration of Linux systems (RHEL, CentOS, Ubuntu)
- Install, configure, patch, and maintain operating systems and services
- Monitor system performance and troubleshoot issues
- Manage users, permissions, and access controls
- Develop automation scripts using Bash/Python
- Provide guidance and oversight for Windows-based environments
- Support system operations, patching, and troubleshooting
- Design, deploy, and manage AWS services (EC2, S3, IAM, VPC, EBS, RDS, CloudWatch, CloudFormation)
- Manage IAM roles and ensure least-privilege access
- Monitor and optimize cloud performance and costs
- Support hybrid cloud environments and migration initiatives
- Implement backup, disaster recovery, and high availability solutions
- Ensure high availability, performance, and security of systems
- Develop and maintain documentation, runbooks, and SOPs
- Participate in incident management and on-call support
- Collaborate with DevOps and development teams on infrastructure needs
- Ensure compliance with security and audit requirements
- Strong experience in Linux administration (RHEL/CentOS/Ubuntu)
- Working knowledge of Windows Server environments
- Hands-on experience with AWS cloud services
- Experience in scripting and automation (Bash/Python)
- Good understanding of networking concepts (TCP/IP, DNS, firewalls, load balancers)
- Proven ability to lead teams and drive technical initiatives
The annual salary for this position is between $79,240 and $130,000, depending on the experience and other qualifications of the successful candidate.
This position is also eligible for Cognizant’s discretionary annual incentive program, which is based on performance and subject to the terms of Cognizant’s applicable plans.
Benefits
- Medical/Dental/Vision/Life Insurance
- Paid holidays plus Paid Time Off
- 401(k) plan and contribution
- Long-term/Short-term Disability
- Paid Parental Leave
- Employee Stock Purchase Plan