The Role
The Site Reliability Engineer holds a critical role in the technology team and is responsible for application performance, availability, reliability and system uptime. The candidate provides consultation and strategic recommendations by quickly assessing and remediating complex platform availability issues. The Site Reliability Engineer Lead will dive head-first into creating or applying innovative solutions and techniques that advance the reliability of Digital products.
Key Responsibilities
Install and deploy new releases and environments for applications.
Build and maintain highly scalable, large-scale deployments globally.
Co-create and maintain architecture for 100% uptime, e.g. by creating alternate connectivity.
Practice sustainable incident response/management and blameless post-mortems.
Monitor and maintain production environment stability.
Own entire platforms (production environments): deploy, automate, maintain and manage production systems to ensure their availability, performance, scalability and security.
Engage in and improve the whole lifecycle of services from inception and design, through deployment, operation and refinement.
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
Collaborate with Agile teams in defining technical requirements and best practices for containerized and cloud-native applications.
Represent production support and site reliability in stand-ups, planning sessions, code reviews and architecture reviews.
Help evolve our configuration management (CM) efforts and our move to containers.
Help the operations head select an enthusiastic, technically knowledgeable team, and guide existing team members.
Skills Required
Should have good know-how of applications, middleware, databases (PostgreSQL, MongoDB, MySQL, etc.), infrastructure and operating systems.
Should have a good understanding of Docker and Kubernetes.
Should have an understanding of CI/CD and DevOps tools such as Jenkins, Ansible and shell scripting.
Monitoring and logging: experience with monitoring and logging tools (e.g. Nagios, AppDynamics, ELK, Prometheus).
Good experience with distributed systems such as RabbitMQ, Kafka and Redis.
Should have experience working with Linux, WebLogic/Tomcat, JBoss and middleware technology.
Should have worked on high-traffic, highly scalable systems in the past.
Knowledge of the fundamental aspects of release automation (packaging, dependencies, promotion, deployment, compliance).
Experience with project management tools such as JIRA, and insight into quality analysis.