Systems Engineer - Automation
Job in Atlanta, Fulton County, Georgia, 30383, USA
Listed on 2026-02-10
Listing for: Tier4 Group
Full Time position
Job specializations:
- IT/Tech: Systems Engineer, Cloud Computing
Overview
Position: Site Reliability Engineer (SRE) - Infrastructure
Location: Atlanta, GA
Employment Type: Full-Time
Work Arrangement: Onsite Hybrid
The Site Reliability Engineer (SRE) will ensure the reliability, scalability, and performance of enterprise applications and services across cloud and on-premises environments. This role focuses on automation, monitoring, and incident response to minimize downtime and enhance operational efficiency. The position requires close collaboration with development, quality assurance, and operations teams to deliver secure and resilient systems.
What You Will Do
- Design, build, and maintain secure, compliant infrastructure using Infrastructure as Code tools such as Terraform and Ansible
- Automate provisioning and management of servers, storage, networks, Kubernetes clusters, and related systems across cloud and on-premises environments
- Develop tools and processes for automated deployment, configuration, monitoring, and alerting
- Collaborate with cross-functional teams to implement scalable and reliable cloud and data center solutions
- Participate in incident response, on-call rotations, and post-incident reviews to improve system resilience
- Monitor system performance and availability using service-level indicators (SLIs) measured against service-level objectives (SLOs) and service-level agreements (SLAs); proactively troubleshoot and resolve reliability, performance, or security issues
- Create and maintain disaster recovery and business continuity plans for critical systems
- Continuously analyze and improve infrastructure efficiency, scalability, and performance
- Stay current with emerging technologies and recommend tools or practices to enhance platform capabilities
- Share technical expertise and mentor team members to strengthen internal capabilities
Required Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent experience
- Proven experience as a Site Reliability Engineer or Systems Engineer
- Strong proficiency in Terraform and Ansible for infrastructure automation
- Hands-on experience with Kubernetes, Docker, or other container and orchestration tools
- Proficiency in scripting languages such as Python or Bash
- In-depth knowledge of Google Cloud Platform (GCP) services including compute, networking, storage, Kubernetes, and security
- Solid understanding of VMware virtualization and enterprise storage systems (e.g., Pure Storage)
- Experience with networking technologies including VLANs, VPNs, and routing protocols
- Strong grasp of IT infrastructure and operations principles, including systems integration and automation best practices
- Excellent communication and collaboration skills
- Ability to manage multiple priorities under pressure with strong problem-solving skills
Preferred Qualifications
- Relevant certifications such as ITIL, PMP, or CISSP
- Experience in regulated or enterprise environments
- Communication and collaboration across technical and business teams
- Problem-solving and analytical thinking
- Ownership and accountability for system reliability
- Adaptability to emerging technologies and changing business needs
- Leadership and mentorship within technical teams