Job Description & How to Apply Below
Experience
6+ years of hands-on experience with Application Performance Management tools such as Datadog, New Relic, AppDynamics, Dynatrace, Splunk ITSI, Honeycomb, Chronosphere, Riverbed Aternity/Alluvio, ExtraHop, and LogicMonitor
Hands-on experience with cloud-native, open-source solutions like Prometheus, Grafana, ELK stack/Elastic.io, and OpenTelemetry (OTEL)
Experience with public cloud solutions like AWS CloudWatch, Azure App Insights, etc.
Strong understanding of network & system management solutions, distributed systems, networking, and database technologies
Operational background and familiarity with ITIL, ITSM, SRE, or DevOps best practices and principles
Excellent problem-solving, organizational, project management, and communication skills
Eagerness to collaborate and contribute to team success, with a continuous learning mindset
Experience with containerization and orchestration technologies like Docker and Kubernetes
Broad background in software engineering with, at a minimum, generalist-level expertise in programming languages and platforms such as Python, Java, Go, .NET, Node.js, Ruby, and PHP
Familiarity with microservices architecture, service mesh technologies, and end-user technologies (iOS, Android, JavaScript, HTML5)
Knowledge of infrastructure-as-code and configuration management tools such as Terraform and Ansible
Roles and Responsibilities
Implement and maintain cutting-edge observability solutions using tools like New Relic, Datadog, AppDynamics, or Dynatrace for our large-scale enterprise customers
Develop and maintain systems for effective monitoring, logging, and tracing, ensuring scalability and reliability
Collaborate with cross-functional teams, including software engineers, product managers, and data scientists, to build resilient systems
Integrate observability practices into different engineering workflows and lead the adoption, optimization, and integration of products within the customer's business infrastructure
Create custom dashboards, set up alerts, and develop AIOps rules, ensuring effective tracking against goals/KPIs
Provide technical support in post-sales processes, including installation, deployment, training, technical check-ups, and escalation management
Identify performance bottlenecks and anomalous system behavior and resolve root causes of service issues
Stay updated with the latest trends in observability, logging, monitoring, and cloud technologies and introduce innovative solutions and best practices
Participate in strategic technology planning, focusing on scalability, cost-effectiveness, and risk management in observability infrastructure
Document observability systems and processes comprehensively and prepare reports for management on system performance and reliability
Utilize Infrastructure as Code (IaC) principles for efficient infrastructure provisioning and management