Platform Engineer: Cloud Infra, CI/CD & Automation
Job in Frisco, Collin County, Texas, 75034, USA
Listed on 2026-05-29
Listing for: T-MOBILE USA, Inc.
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, Cloud Computing, IT Support, Cybersecurity
Job Description & How to Apply Below
At T-Mobile, we invest in YOU! Our Total Rewards Package ensures that employees get the same big love we give our customers. All team members receive a competitive base salary and compensation package - this is Total Rewards. Employees enjoy multiple wealth-building opportunities through our annual stock grant, employee stock purchase plan, 401(k), and access to free, year-round money coaches. That's how we're UNSTOPPABLE for our employees!
Are you ready to join the Un-carrier movement?
The Platform Engineer is essential for designing and optimizing infrastructure that supports internal services and platforms within the organization. It involves building resilient and scalable systems to ensure robust infrastructure and enterprise-wide governance. The role focuses on implementing continuous integration and continuous deployment (CI/CD) pipelines and enhancing data platform interoperability. Success is measured by system reliability, efficient software deployment, and seamless technology migrations and integrations.
The work impacts the organization by enabling adaptability and operational excellence in a dynamic technological environment.
We are a team that encourages innovation and advocates an agile and open approach, truly working and playing in the Un-carrier way!
Job Responsibilities:
- Designs and develops resilient and scalable infrastructure systems
  - Own day-to-day operations of hybrid infrastructure supporting T-Mobile's platform services
  - Ensure uptime, performance, and security across on-prem, AWS, and Azure environments
  - Troubleshoot complex infrastructure, configuration, and deployment issues impacting platform reliability
  - Lead patching, updates, and configuration management with minimal oversight
  - Participate in on-call rotation and drive improvements in incident response and postmortem practices
- Optimizes existing infrastructure to enhance functionality and interoperability
  - Ensure system reliability, performance, and security through proactive monitoring, automation, and performance tuning
  - Troubleshoot complex platform and application integration issues impacting performance or availability
  - Develop, execute, and tune DML logic (queries, data migrations, transformations, and batch operations) for performance and reliability
  - Drive incident analysis and reliability reviews to improve operational posture and system resilience
- Implements and maintains CI/CD pipelines for efficient software deployment
  - Support containerization and orchestration technologies, including Docker and Kubernetes, to standardize deployment practices
  - Design, develop, and maintain automation using Python, Bash, or PowerShell to increase operational efficiency
  - Implement and manage Infrastructure-as-Code (IaC) using Terraform, Ansible, or equivalent frameworks
  - Build and maintain CI/CD pipelines for platform infrastructure and environment deployments (GitLab CI/CD, Jenkins)
  - Establish documentation standards and reusable modules for consistent automation delivery
- Collaborates within Agile teams to drive continuous improvement and operational excellence
  - Contribute to Agile ceremonies, driving infrastructure readiness and delivery excellence
  - Advocate for DevOps and automation best practices across engineering teams
  - Participate in on-call rotations and lead incident resolution and root cause analysis
  - Lead efforts in automation for self-healing, scaling, and performance tuning
- Facilitates seamless migrations and integrations across different technologies
  - Develop and manage observability solutions leveraging Prometheus, Grafana, CloudWatch, Azure Monitor, or ELK
  - Implement proactive monitoring, log analysis, and metrics-based alerting for early issue detection
  - Collaborate with SREs to improve mean time to resolution (MTTR) and overall platform reliability
  - Execute and optimize DML operations within Postgres, Oracle, and Cassandra environments under existing organizational DDL structures
  - Develop and maintain integrations with Kafka for event-driven data pipelines, message publishing, and asynchronous workloads
  - Implement caching strategies (Redis, in-memory caches) to reduce query latency and improve application performance
  - Support database…