
Sr SRE (Ansible)

Job in Bellevue, King County, Washington, 98009, USA
Listing for: Insight Global
Full Time position
Listed on 2026-01-09
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, Cybersecurity, IT Support
Job Description & How to Apply Below
Position: Sr SRE (Ansible)

Job Description

We're building automated disaster recovery failover solutions to ensure high availability across this enterprise telecom company's critical infrastructure. This is a fast-paced, high-impact role where you'll design and implement failover automation spanning applications, databases, and network layers.

We're looking for quick learners who deliver fast and leverage AI-assisted development to accelerate outcomes. You’ll work closely with database teams, application owners, and network engineers to build a robust automation framework that supports multi-database failover, network rerouting, and application-level resilience. This person will assist in the effort to create a push-button failover system that enables real-time disaster recovery across their critical applications.

You will help create a dashboard-driven automation suite that empowers teams to manage failovers with confidence, reduce toil, and improve customer experience during outages.

Responsibilities
  • Build Ansible playbooks and GitLab CI/CD pipelines for automated failover workflows, eventually migrating to Ansible Automation Platform (AAP) as the failover orchestration layer
  • Independently onboard applications into the failover framework—gather requirements, understand architecture, and implement with minimal hand holding from app teams
  • Automate database failover (Oracle, MongoDB, PostgreSQL, MSSQL) and messaging systems
  • Integrate with CyberArk, HashiCorp Vault, and F5 load balancers (GTM/LTM)
  • Create ServiceNow change automation and observability dashboards
  • Proactively engage application owners and drive conversations to unblock delivery
  • Design and implement observability solutions—build monitoring dashboards, alerting, and health-check mechanisms to provide real-time visibility into failover readiness and execution
  • Recommend and establish best practices—evaluate current processes, identify gaps, and propose improvements for failover patterns, automation standards, and operational runbooks
  • Document everything—create clear, comprehensive technical documentation, architecture diagrams, runbooks, and onboarding guides that enable team scalability and knowledge transfer
  • Build reusable automation frameworks—develop modular, maintainable automation components that can be extended across applications and environments
You Are
  • Self-driven – You take ownership, find answers yourself, and don't wait to be told what to do next
  • Fast learner – You ramp quickly on new tools and ecosystems with minimal guidance
  • Independent operator – You can engage app teams directly, extract what you need, and fill gaps through your own research
  • Delivery-focused – You ship iteratively and thrive in ambiguity
  • Relationship builder – You build trust with stakeholders and drive conversations forward
  • Strong communicator – You document well and proactively flag blockers
  • Continuous improver – You don't just execute; you identify what's broken and propose better ways of doing things
  • Knowledge sharer – You believe documentation is a first-class deliverable, not an afterthought

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances.

If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to

To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy:

Skills and Requirements
  • 5+ years in SRE, DevOps, or Infrastructure Automation
  • Ansible & GitLab CI/CD expertise
  • Python/Bash scripting; strong YAML skills
  • AWS and Kubernetes experience
  • Familiarity with secret management (CyberArk, Vault)
  • Experience using AI coding tools (Claude, Copilot, ChatGPT) to accelerate delivery
  • Strong documentation skills: ability to translate complex systems into clear, actionable guides
  • F5 GTM/LTM and network failover experience
  • Chaos engineering background
  • ServiceNow automation experience
  • Telecom or large enterprise environment experience
  • Experience with observability platforms (Splunk, Dynatrace, Grafana, Prometheus)
  • Track record of establishing automation standards and best practices in enterprise environments