Senior Site Reliability Engineer
Listed on 2026-01-29
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability
Overview
iHeartMedia
The audio revolution is here - and iHeart is leading it! iHeartMedia, the number one audio company in America, reaches 90% of Americans every month, with a monthly audience that's twice the size of any other audio company and almost three times the size of the largest TV network. We are the home of many of the country's most popular on-air personalities and podcast influencers, and we create and produce some of the most popular branded live music events in America.
We reach almost every community in America and are committed to providing programming that reflects the diversity of the communities we serve. Our company values stress collaboration, curiosity, welcoming dissent, accepting mistakes in pursuit of new ideas, and respect for everyone.
If you're excited about this role but don't feel your experience aligns perfectly with the job description, we encourage you to apply anyway. iHeartMedia is dedicated to building a diverse, inclusive, and authentic workplace and is looking for teammates passionate about what we do.
The Senior Site Reliability Engineer will be responsible for leading a talented team of SREs/DevOps Engineers across a wide variety of Cloud Services. This person will be our leader as we move toward a platform/systems architecture and infrastructure that is highly automated, fully instrumented, self-scaling, self-healing and loosely coupled. Must be a go-getter with strong multitasking and people management skills.
What You'll Do
- Standardize and modernize Amazon EKS platforms & AWS Serverless Suites, including all cutting-edge managed services from AWS, adhering to DevOps best practices.
- Provide expertise and hands-on implementation of large-scale, mission-critical Kubernetes workloads with high resiliency and multi-region architecture.
- Work collaboratively with 2 to 5 Site Reliability Engineers.
- Champion accountability; take responsibility through actions & thoughts.
- Design and implement end-to-end CI/CD pipelines with CDK and CodePipeline, including integrating with source control, build tools and deployment targets like CloudFormation (CFT) stacks.
- Prioritize & re-align quickly to adapt to a demanding, fast-paced Shift Left environment.
- Maximize automation to improve speed and quality while relentlessly driving low-value, repetitive work out of our operational activities.
- Work with our application delivery teams to design and build scalable and maintainable solutions for our customers.
- Enforce a GitOps workflow where Git is the source of truth for EKS clusters and app state in a multi-account and multi-region environment (FluxCD/ArgoCD).
- Develop baselines for governance, consumption/cost and performance to ensure that our elastic cloud-based applications operate efficiently, securely and with zero downtime.
- Run Reliability Incident management processes along with Root Cause Analysis, developing Runbooks, & Self-Healing architecture.
- Instill standardization in DevOps processes across a wide range of applications.
What You'll Need
- 6+ years of hands-on experience in public cloud, specifically AWS.
- 3+ years of leading SRE/DevOps teams across complex AWS ecosystems.
- Deep understanding of high-velocity SDLC best practices along with CI/CD & application/infrastructure monitoring practices to operate workloads at high scale.
- Expert proficiency in Kubernetes, Terraform, AWS CDK, Lambda, API Gateway, Route 53, S3, EC2, Load Balancing, DynamoDB, CloudWatch, IAM, Networking, IoT, SQS, EventBridge, etc.
- Adept at solving & troubleshooting high-volume, distributed-architecture applications running on AWS.
- Demonstrated ability to design, build, and maintain AWS infrastructure using AWS CDK (TypeScript preferred) with strong modular patterns (multi-stack, multi-account, multi-region).
- Strong understanding of GitOps methodologies, with experience implementing and managing multiple environments through declarative configuration versioned in Git repos and applied via automated tools like Flux or ArgoCD.
- Hands-on experience managing large-scale, production EKS clusters across multiple regions and AWS accounts.
- Deep knowledge of AWS cost optimization techniques such as Reserved Instances, Spot Instances, and Lifecycle Management.
- Proven ability to build highly secure AWS infrastructure with a security-first mindset.
- Proven ability to collaborate and build strong relationships with development teams, including resolving conflicts and driving decisions/initiatives.
- Strong software development background including knowledge of microservices architecture along with fluency in JavaScript, TypeScript, Node.js, or Python.
- At least one of the following certifications: AWS Solutions Architect Associate; AWS Solutions Architect Professional; AWS DevOps Associate; AWS DevOps Professional; Professional Kubernetes Certifications.
- Respect for others and a strong belief that others should do this in return
- Expertise with various technical disciplines and applications
- Close…