Cloud SRE Lead and Major Incident Digital Transformation Evangelist

Job in Reston, Fairfax County, Virginia, 22090, USA
Listing for: Peraton
Full Time position
Listed on 2026-01-01
Job specializations:
  • IT/Tech
    IT Project Manager, Cloud Computing
Job Description & How to Apply Below

Responsibilities

We are seeking an empowered, experienced, deeply technical, and highly motivated Cloud SRE Lead and Major Incident Digital Transformation Evangelist to join our dynamic team.

In this customer‑facing role as the Major Incident Digital Transformation Evangelist, you will evaluate current processes, design a modern incident management approach, evangelize it and gain agreement across teams, and execute the digital transformation to modern incident management.

As the Cloud SRE Lead, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud infrastructure on Amazon Web Services (AWS) and guide the daily activities of the SRE team. You will lead the SRE team by running daily standups, ensuring work is pulled from multiple sources, and surfacing blockers. In this Lead role, you will collaborate closely with cross‑functional teams, including development, quality assurance, service desk, network, database, and operations, to ensure seamless software releases and continuous improvement of our release processes.

Major Incident Digital Transformation Evangelist Responsibilities:
  • Execute Ideation Sessions: Run ideation sessions across multiple teams and companies to identify areas for improvement and ideas that could radically change the current incident management process.
  • Establish Modern Incident Management Tooling: Review currently available tools and industry best‑of‑breed options to recommend and champion the right tools, technology, and capabilities to empower, visualise, communicate with, and activate cross‑functional teams.
  • Lead Major Incidents: Coordinate and lead Major Incidents by directing the troubleshooting, communicating status, encouraging action, guiding the use of tools, and ensuring swift and complete resolution of each Major Incident.
  • Guide Postmortem Analysis: Schedule and lead blameless postmortems encouraging independent ideas, identification of true root causes, and communication of findings.
Cloud SRE Responsibilities:
  • Infrastructure Automation: Design, implement, and manage infrastructure as code (IaC) solutions using tools like AWS CloudFormation, Terraform, or Helm charts to automate deployment and scaling processes. Collaborate with development teams to integrate continuous deployment practices and ensure the reliability of applications.
  • Monitoring and Alerting: Implement robust monitoring and alerting systems to proactively identify and address potential issues before they impact system performance. Analyse system metrics, logs, and alerts to troubleshoot and resolve issues promptly.
  • Performance Optimization: Conduct performance analysis and optimization of AWS infrastructure components to enhance system efficiency and reduce latency. Identify and implement improvements to enhance system reliability and resilience.
  • Incident Response: Participate in on‑call rotations to respond to and resolve incidents promptly. Conduct post‑incident reviews to identify root causes and implement preventive measures.
  • Security and Compliance: Work closely with security teams to implement and enforce best practices for securing AWS environments. Ensure compliance with industry standards and regulations related to cloud infrastructure.
  • Communication: Facilitate clear communication across teams, providing updates on release status, known issues, and any potential impact on stakeholders. Coordinate communication of release schedules and changes to all relevant parties.
  • Release Planning and Coordination: Collaborate with development, QA, and operations teams to plan and coordinate software releases. Define release scope, schedule, and dependencies to ensure timely and smooth deployments. Create and submit change records as required for process and audit compliance. Participate in Technical Change Advisory and Review boards as required.
  • Release Automation: Develop and maintain automated deployment pipelines using industry‑standard tools such as AWS CI/CD, GitLab CI/CD, Jenkins, or similar. Automate and streamline release processes to improve efficiency and reduce manual errors.
  • Continuous Improvement: Proactively identify areas for process improvement within the…