Cloud Engineer
Listed on 2026-01-14
IT/Tech
Cloud Computing, Systems Engineer, IT Support
We are UMG, the Universal Music Group. We are the world's leading music company. In everything we do, we are committed to artistry, innovation and entrepreneurship. We own and operate a broad array of businesses engaged in recorded music, music publishing, merchandising, and audiovisual content in more than 60 countries. We identify and develop recording artists and songwriters, and we produce, distribute and promote the most critically acclaimed and commercially successful music to delight and entertain fans around the world.
Job Description
How we LEAD:
We are currently seeking an eager and collaborative individual to serve as the technical resource on the Cloud Ops team. This role is responsible for providing technical support and guidance on cloud-based SAP solutions, as well as being a key technical support resource for the greater cloud landscape.
How you'll CREATE:
- Utilize technical skills and tools to coordinate, maintain, enhance, and deploy cloud-based solutions while following corporate standards and best practices
- Work within a Cloud Ops team to meet project objectives in a timely fashion
- Update project status reports (progress/risks/issues/roadblocks) as required by Senior Management
- Use Cloud Ops methodology and toolsets to implement and support solutions in AWS, Azure, and GCP
- Implement proposed solutions to ensure the delivery of a quality product or service
- Ensure cloud resources meet UMG's operational requirements and are in compliance with UMG infrastructure and security standards
- Provide documentation and communication to peers and other IT Teams for status, coordination, objectives, and performance
- Compile and remediate vulnerabilities identified by the security team
- Assess and troubleshoot, consult with vendors, and coordinate with other teams for problem resolution
- Ensure high work standards regarding incident and change management tasks
- Other tasks as deemed necessary or appropriate
- SUSE Linux Enterprise Server (SLES) Administration: Deploy, configure, and manage SLES EC2 instances in AWS; troubleshoot boot issues; manage LVM (lvs, vgs, pvs); configure SUID, SGID, and the sticky bit; expand XFS file systems and EBS volumes; manage permissions and ACLs. Also covers user/group management, SSH security, package management (zypper, rpm), process/service management (systemd), log analysis, networking (TCP/IP, DNS, routing, firewalls), backup/restore, patching, and security hardening.
- AWS Services: EC2, S3, EFS, Storage Gateway, EBS, VPC, VPC Endpoints, IAM, CloudWatch, CloudTrail; provisioning, configuration, monitoring, and optimization.
- Automation & Scripting: Bash/shell scripting, AWS CLI, Infrastructure-as-Code (Terraform/Ansible), cron jobs and system automation.
- Monitoring & Performance: System resource monitoring, capacity planning, AWS monitoring tools, and troubleshooting performance bottlenecks.
- Proficient usage and management of Linux and Windows technology stacks (Ubuntu, Windows Server, Nginx, Apache, MySQL, PHP, IIS, MS SQL, .NET, and more)
- Proven experience with AWS, GCP, and Azure in an enterprise setting (personal use of these services will not be considered as equivalent)
- Experienced usage of Kubernetes technology stacks (K8s, EKS, GKE, Helm, Prometheus, Cortex, Grafana, Istio, and more)
- Accomplished with modern Cloud Ops methodologies and toolsets (Chef, Terraform, Jira, Slack, Vault, Python, Cortex)
- Solid grasp and usage of version control and release management concepts and tools, particularly Git (GitHub / SVN / Bitbucket) for code branching and merging
- Understanding of Modern Auth (OIDC, SSO, SAML, Federation)
- Build and support solutions using both serverless and server-based infrastructure in AWS, Azure, and GCP cloud environments
- Follow deployment practices using CI/CD processes and technologies (Jenkins, TeamCity, Tekton, Spinnaker, Octopus, CodeDeploy, Automate)
- Solid understanding of key cloud design concepts such as "High Availability" (HA), "Elastic Load Balancing" (ELB), Principle of Least Privilege, Resiliency, Ephemeral Computing, Stateless Computing, Virtual Networking, and Scaling
- Detailed knowledge and demonstrated experience with key AW…