Cloud Operations Engineer
Listed on 2025-12-01
IT/Tech
Cloud Computing, Cybersecurity, Systems Engineer, IT Support
About Coalfire
Coalfire is on a mission to make the world a safer place by solving our clients’ hardest cybersecurity challenges. We work at the cutting edge of technology to advise, assess, automate, and ultimately help companies navigate the ever-changing cybersecurity landscape. We are headquartered in Chicago, Illinois with offices across the U.S. and U.K., and we support clients around the world.
But that’s not who we are – that’s just what we do.
We are thought leaders, consultants, and cybersecurity experts, but above all else, we are a team of passionate problem-solvers who are hungry to learn, grow, and make a difference.
Position Summary
We’re looking for a Site Reliability Engineer to join the Coalfire team. If you’re driven by a desire to innovate, excel at operational excellence, and thrive in a collaborative environment, come be part of a team committed to making the world a safer place.
What You’ll Do
- Hands‑on engineering work, including developing new deployments, automation scripts, and tooling to meet client needs.
- Manage and maintain patch management processes, ensuring timely updates, security compliance, and system stability across cloud and on‑prem environments.
- Oversee Identity and Access Management (IAM), implementing and enforcing security best practices to protect sensitive data and ensure proper access controls.
- Perform cloud administration and system administration tasks, such as provisioning resources, optimizing performance, and troubleshooting infrastructure issues.
- Collaborate with senior engineers and solutions architecture teams to address complex technical issues, ensuring timely resolutions and maintaining client satisfaction.
- Adhere to established quality standards for engineering deliverables, aligning with internal protocols, compliance regulations, and project deadlines.
- Identify and communicate potential risks, working with relevant stakeholders to incorporate mitigation strategies that meet regulatory and client expectations.
- Contribute to day‑to‑day project tasks, including tracking progress, providing updates, and ensuring assigned activities are completed on schedule.
- 3–5 years in systems engineering and architecture, including requirements gathering, basic architecture development, systems integration, and testing
- 3–5 years in cloud computing (AWS, Azure, or GCP), covering design, deployment, operations, and basic automation
- 3–5 years working with Infrastructure‑as‑Code (for example, Terraform, Ansible) to provision and manage cloud resources
- Experience meeting SLAs through effective issue identification, escalation, and resolution in a fast‑paced environment
- Proven track record of contributing to operational improvements (for example, automating workflows, enhancing monitoring) and supporting compliance requirements (for example, FedRAMP)
- Experience participating in project definition and documentation, including planning, design reviews, and post‑implementation summaries
- Managed Services Expertise: Familiarity with ticket management systems and meeting SLA requirements in a managed services environment
- Cloud and Automation: Hands‑on experience with AWS, Azure, or GCP; working knowledge of Terraform, Ansible, GitLab, and CI/CD technologies
- Technical Collaboration: Proven ability to work alongside Site Reliability Engineers and cross‑functional teams, contributing to team problem‑solving and performance improvements
- Soft Skills: Strong interpersonal, organizational, and problem‑solving skills; capable of building trust with internal stakeholders and clients
- Documentation and Communication: Skilled at creating technical diagrams and clear written documentation; able to convey complex ideas effectively
- Professionalism and Autonomy: Demonstrated ability to manage individual tasks, balance priorities, and maintain a professional attitude in both independent and team settings
- Security Mindset: Critical thinker capable of meeting security and compliance requirements without compromising operational objectives
- Serverless and Modern Architectures: Exposure to serverless, microservices, containerization, or other modern application frameworks
- Network…