Sr Observability Engineer
Job in Holmdel Township, Monmouth County, New Jersey, USA
Listed on 2026-02-07
Listing for: Perennial Resources International
Full Time position
Job specializations:
- IT/Tech
- IT Support, Cloud Computing
Job Description & How to Apply Below
Overview
We are seeking a dedicated and detail-oriented Senior Observability Engineer with expertise in Splunk, AppDynamics, OpenTelemetry, and Zenoss to join our Enterprise Observability Engineering team. The ideal candidate will be responsible for the administration, configuration, and maintenance of our observability tools to ensure optimal performance and reliability of our IT systems.
Key Responsibilities
- Administration and Implementation
- Administer and configure the Splunk, AppDynamics, OpenTelemetry (OTel), and Zenoss platforms to meet organizational monitoring needs.
- Perform regular updates, patches, and upgrades to observability tools to ensure they are up-to-date and secure.
- Monitoring and Maintenance
- Continuously monitor the health and performance of the Splunk, AppDynamics, and Zenoss systems.
- Ensure data integrity and availability within the observability platforms.
- User Support and Training
- Provide support to internal users, assisting with troubleshooting and resolving issues.
- Develop and deliver training sessions for users to effectively utilize the monitoring tools.
- Dashboard and Alert Management
- Create and manage dashboards, reports, and alerts.
- Work with stakeholders to define monitoring requirements and implement appropriate alerting mechanisms.
- Data Management and Optimization
- Manage data onboarding and alert creation.
- Optimize system performance by tuning configurations and managing resource utilization.
- Documentation and Best Practices
- Maintain comprehensive documentation of configurations, processes, and procedures.
- Develop and enforce best practices for monitoring and observability within the organization.
- Collaboration and Incident Response
- Collaborate with IT and DevOps teams to ensure comprehensive monitoring coverage.
- Participate in incident response efforts, using observability data to assist in troubleshooting and resolution.
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- Minimum of 5–7 years in observability, monitoring, or site reliability engineering with a focus on Splunk, AppDynamics, and Zenoss.
- Proven experience in implementing, managing, and maintaining observability tools.
- Proficiency in Splunk and AppDynamics (including configuration, administration, and implementation).
- Proficiency in Zenoss (including setup, configuration, and maintenance).
- Strong in MELT (metrics, events, logs, and traces), with hands-on troubleshooting and support experience.
- OpenTelemetry: instrumentation patterns, context propagation, collectors, sampling, etc.
- Ability to maintain platform reliability, upgrades, patching, and security hardening.
- Exposure to Kubernetes observability (cluster/workload metrics, events, service discovery).
- Strong knowledge of IT infrastructure, applications, and networking.
- Experience with scripting and automation tools (e.g., Python, Bash).
- Familiarity with cloud environments (e.g., AWS, Azure) is required.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities.
- Ability to work independently and in a team-oriented environment.
- Experience with other monitoring and observability tools (e.g., Prometheus, Grafana).
- Knowledge of DevOps practices and CI/CD pipelines.
- Hands-on experience with Infrastructure-as-Code (Terraform/Ansible) and Git-based workflows.