In this role, you will play a crucial part in shaping the firm's infrastructure reliability and efficiency by implementing robust Site Reliability Engineering practices. Your contribution will be pivotal in ensuring the availability, scalability, and performance of our systems and applications. Leveraging your strong technical skills and expertise in DevOps principles, you will work towards enhancing the reliability of our infrastructure and minimizing downtime, thus enabling the organization to deliver high-quality software with maximum efficiency.
EXPERIENCE AND REQUIRED SKILL SETS
Ensure 24/7 uptime and stability of production systems
Investigate and troubleshoot production issues
Collaborate with developers to optimize system performance
Participate in on-call rotation to provide 24/7 support for critical systems
Work on automation and enhancements to reduce manual processes / intervention.
5+ years of relevant experience in an SRE / Production / Product Support role, with a track record of implementing SRE practices
Basic understanding of cloud solutions from providers such as AWS or Azure.
Basic to intermediate knowledge of scripting in Bash, Python, or PowerShell.
Good presentation, communication and interpersonal skills with the ability to collaborate effectively with cross-functional teams and stakeholders across different countries and cultures.
Good problem solving and troubleshooting skills
Continuous learning mindset and willingness to adapt to new technologies and industry trends.
Good understanding of operating system commands (Linux) and SQL (ability to write and analyze queries and to derive or build the information a requirement calls for)
In-depth knowledge of Trading Life Cycle:
The candidate should possess a comprehensive understanding of the trading life cycle, including order management, trade execution, settlement and post-trade processes. Familiarity with various financial products like Equities, Derivatives, Currencies, Commodities, FX is a plus.
Incident and Problem Management Expertise:
The candidate must demonstrate strong problem-solving skills and the ability to manage incidents quickly and efficiently within a fast-paced trading environment. This includes identifying, analyzing and resolving issues related to trading systems and processes, as well as collaborating with cross-functional teams to implement long-term solutions and improve operational efficiency.
Good Understanding of Tools
Orchestration – Autosys / Airflow or Cron
Monitoring & Logging – PagerDuty, Prometheus & Grafana or Datadog, Splunk
Project Management / ITSM – ServiceNow (basic ability to navigate and create change tickets / incidents), Jira (basic ability to create Jira tickets and filter your work)
EDUCATION
Bachelor’s or master’s degree in Computer Science, Engineering, Software Engineering or a relevant field