Sr Observability Engineer
Listed on 2026-02-12
IT/Tech
IT Support, Cloud Computing, Systems Engineer, Cybersecurity
Software Guidance & Assistance, Inc. (SGA) is searching for a Sr Observability Engineer for a CONTRACT assignment with one of our premier Insurance services clients in Holmdel, NJ or Bethlehem, PA.
- Administer and configure the Splunk, AppDynamics, OpenTelemetry (OTel), and Zenoss platforms to meet organizational monitoring needs.
- Perform regular updates, patches, and upgrades to observability tools to ensure they are up-to-date and secure.
- Continuously monitor the health and performance of the Splunk, AppDynamics, and Zenoss systems.
- Ensure data integrity and availability within the observability platforms.
- Provide support to internal users, assisting with troubleshooting and resolving issues.
- Develop and deliver training sessions for users to effectively utilize the monitoring tools.
- Create and manage dashboards, reports, and alerts.
- Work with stakeholders to define monitoring requirements and implement appropriate alerting mechanisms.
- Manage onboarding and alert creation.
- Optimize system performance by tuning configurations and managing resource utilization.
- Maintain comprehensive documentation of configurations, processes, and procedures.
- Develop and enforce best practices for monitoring and observability within the organization.
- Collaborate with IT and DevOps teams to ensure comprehensive monitoring coverage.
- Participate in incident response efforts, using observability data to assist in troubleshooting and resolution.
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- Minimum of 5-7 years of experience in observability, monitoring, or site reliability engineering, with a focus on Splunk, AppDynamics, and Zenoss.
- Proven experience implementing, managing, and maintaining observability tools.
- Proficiency in Splunk and AppDynamics (including configuration, administration, and implementation).
- Proficiency in Zenoss (including setup, configuration, and maintenance).
- Strong in MELT (Metrics, Events, Logs, and Traces); hands‑on troubleshooting and support.
- OpenTelemetry: instrumentation patterns, context propagation, collectors, sampling, etc.
- Experience maintaining platform reliability, upgrades, patching, and security hardening.
- Exposure to Kubernetes observability (cluster/workload metrics, events, service discovery).
- Strong knowledge of IT infrastructure, applications, and networking.
- Experience with scripting and automation tools (e.g., Python, Bash).
- Familiarity with cloud environments (e.g., AWS, Azure) is required.
- Excellent problem‑solving and analytical skills.
- Strong communication and collaboration abilities.
- Ability to work independently and in a team‑oriented environment.
- Experience with other monitoring and observability tools (e.g., Prometheus, Grafana).
- Knowledge of DevOps practices and CI/CD pipelines.
- Hands‑on experience with Infrastructure as Code (Terraform/Ansible) and Git‑based workflows.
SGA is a technology and resource solutions provider driven to stand out. We are a women‑owned business. Our mission: to solve big IT problems with a more personal, boutique approach. Each year, we match consultants like you to more than 1,000 engagements. When we say let's work better together, we mean it. You'll join a diverse team built on these core values: customer service, employee development, and quality and integrity in everything we do.
Be yourself, love what you do, and find your passion.
SGA is an Equal Opportunity Employer and does not discriminate on the basis of Race, Color, Sex, Sexual Orientation, Gender Identity, Religion, National Origin, Disability, Veteran Status, Age, Marital Status, Pregnancy, Genetic Information, or Other Legally Protected Status. We are committed to providing access, equal opportunity, and reasonable accommodation for individuals with disabilities in employment, and our services, programs, and activities.
Please visit our company EEO page to request an accommodation or assistance regarding our policy.