Senior Technical Lead - DevOps
Listed on 2025-11-25
IT/Tech
Cloud Computing, SRE/Site Reliability, Systems Engineer, IT Support
- Provide consulting services for improved system stability, availability, performance and reliability.
- Assist in determining the impact of operational issues and provide input into their resolution via data extraction and quantification.
- Work through day-to-day support issues, troubleshoot customer-impacting issues, and ensure effective and timely resolution of issues in the production environment.
- Support multiple applications in an enterprise environment, specifically systems running on Kubernetes, Gloo, AWS, Apigee, PCF, and GCP, as well as Java-based systems.
- Support Gloo running on Kubernetes, Apigee OPDK and SaaS, Grafana, Prometheus, Cassandra, Postgres, and Spring Boot or other Java-based applications running on Kubernetes, PCF, and Java application servers.
- Apply GitOps principles to manage infrastructure and application configurations.
- Implement monitoring and create complex alerts and dashboards for production systems.
- Provide capacity analysis and tuning analysis for Apigee and Java applications hosted on Linux and container platforms.
- Provide 24x7 on-call support on a rotating basis with other team members.
- Lead efforts in troubleshooting, recovery, and root cause investigation.
- Perform analysis of user requirements and problems to automate or improve systems and review system capabilities, workflow, and scheduling limitations.
- Follow and develop detailed work plans, schedules, project estimates, resource plans, and status reports.
- Facilitate HA (High Availability) / DR (Disaster Recovery) exercises to ensure that the team is fully prepared for any event.
- Lead root cause analysis sessions to understand what causes issues in production, and produce an RCA report with solutions that prevent them from recurring.
- Ensure documentation is created and remains updated for any related work.
- Strong understanding of UNIX operating systems and at least one scripting language.
- Forecast and plan for a rapidly growing environment.
- Evaluate new software product and service solutions.
Skill Requirements: QL API support.
Git, GitLab, Docker, Postman, Splunk, AppDynamics, Imperva WAF, and CI/CD tools.