Site Reliability Engineer; SRE – Operations Job, Kansas City area, Missouri, USA, IT/Tech

Position: Site Reliability Engineer (SRE) – Operations
As a leading financial services and healthcare technology company based on revenue, SS&C is headquartered in Windsor, Connecticut, and has 27,000+ employees in 35 countries. Some 20,000 financial services and healthcare organizations, from the world's largest companies to small and mid-market firms, rely on SS&C for expertise, scale, and technology.
**Job Description**

**Job Title:**
Senior Site Reliability Engineer - Operations

**Locations:**
Kansas City, MO | Hybrid | MO, TX, GA, and FL | Remote
**Get To Know The Team:**
We are seeking a highly skilled **Site Reliability Engineer (SRE)** to join our **Operations** team. In this role, you will be responsible for ensuring the availability, performance, scalability, and reliability of our systems and services. You will work closely with infrastructure, engineering, DevOps, and security teams to build robust systems, automate operations, and implement best practices for incident response, monitoring, and disaster recovery.
**Why You Will Love It Here!**
* **Flexibility:** Hybrid Work Model & a Business Casual Dress Code, including jeans
* **Your Future:** 401k Matching Program, Professional Development Reimbursement
* **Work/Life Balance:** Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays
* **Your Wellbeing:** Medical, Dental, Vision, Employee Assistance Program, Parental Leave
* **Diversity & Inclusion:** Committed to Welcoming, Celebrating and Thriving on Diversity
* **Training:** Hands-On, Team-Customized, including SS&C University
* **Extra Perks:** Discounts on fitness clubs, travel and more!
**What You Will Get To Do:**
* **System Reliability & Performance**
  + Maintain and improve the uptime, performance, and availability of production systems.
  + Define and track **SLIs**, **SLOs**, and **SLAs** to ensure service reliability and user satisfaction.
* **Monitoring & Incident Response**
  + Implement and manage monitoring, alerting, and observability tools (e.g., Prometheus, Grafana, Datadog, ELK).
  + Participate in on-call rotations and respond to incidents, performing root cause analysis and postmortems.
* **Automation & Tooling**
  + Automate repetitive tasks and processes using scripts, configuration management, and Infrastructure as Code (IaC).
  + Develop CI/CD pipelines to streamline deployment and operational processes.
* **Capacity Planning & Scaling**
  + Analyze system performance and capacity trends to plan for future growth.
  + Collaborate with engineering teams to design systems that scale reliably.
* **Infrastructure Management**
  + Support cloud and/or hybrid infrastructure (AWS, Azure, GCP, VMware, etc.).
  + Manage system provisioning, configuration, and patching via tools such as Ansible, Terraform, or Puppet.
* **Collaboration & Culture**
  + Act as a bridge between development and operations teams, championing DevOps and SRE principles.
  + Contribute to a culture of continuous improvement, reliability, and accountability.
**What You Will Bring:**
* Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
* 3+ years of experience in a Site Reliability, DevOps, or Systems Engineering role.
* Experience with **Linux/Unix systems**, **Windows**, shell scripting, and administration.
* Proficiency in at least one programming/scripting language (Python, Go, Bash, etc.).
* Hands-on experience with cloud platforms (**AWS**, **Azure**, or **GCP**).
* Strong knowledge of networking, security, load balancing, and DNS.
* Experience with monitoring/logging tools (e.g., Prometheus, Grafana, ELK, Splunk, Datadog).

**Preferred Qualifications:**
* Experience with containerization and orchestration tools (e.g., **Docker**, **Kubernetes**).
* Familiarity with ITIL processes, incident/change/problem management frameworks.
* Exposure to compliance and security standards (e.g., ISO 27001, SOC 2, HIPAA).
* Experience in large-scale distributed systems and microservices architectures.

**Soft Skills:**
* Strong analytical and problem-solving skills.
* Excellent communication and collaboration abilities.
* Calm under pressure, especially during incidents and outages.
* Passion for automation,…

