
Observability Engineer

Job in Cleveland, Cuyahoga County, Ohio, 44101, USA
Listing for: Bayforce
Full Time position
Listed on 2026-06-05
Job specializations:
  • IT/Tech
    Systems Engineer, IT Support
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below

Duration: 6 Months (Potential Extension)

Location: Cleveland, OH area – Hybrid (4 days onsite / 1 day remote)

About the Role

We are seeking an experienced Observability Engineer to support and expand a centralized enterprise observability platform. This initiative is focused on building a true “single pane of glass” monitoring environment using modern telemetry and monitoring technologies including Prometheus, Grafana, and Loki.

The current environment captures approximately 50% of server telemetry and is now evolving to include cross-domain observability across infrastructure, applications, databases, storage, and business transaction data. Long-term goals include enabling AI/ML-driven anomaly detection and intelligent root-cause analysis.

This is an opportunity to play a key role in building an enterprise-wide operational intelligence platform.

Responsibilities
  • Expand telemetry ingestion across infrastructure, databases, storage platforms, applications, and network environments
  • Assist with onboarding remaining systems and extending monitoring beyond traditional OS metrics
  • Build and enhance Grafana dashboards that correlate infrastructure health with application performance and business transaction metrics
  • Develop and maintain synthetic monitoring scripts using Playwright or similar tools to simulate critical user journeys
  • Configure and optimize alerting workflows using Alertmanager and Loki
  • Improve signal-to-noise ratio and reduce alert fatigue through better event management practices
  • Establish and maintain telemetry labeling standards and data quality practices
  • Support troubleshooting, root-cause analysis, and operational documentation efforts
  • Partner with engineering and infrastructure teams to drive observability best practices across the enterprise
Required Qualifications
  • Hands-on experience with:
    • Grafana
    • Loki
    • Alertmanager
  • Strong experience writing PromQL queries and building Grafana dashboards
  • Experience designing or supporting enterprise observability and monitoring platforms
  • Ability to collect and normalize telemetry across:
    • Servers
    • Databases
    • Networks
    • Applications
  • Experience with synthetic monitoring tools such as Playwright or Selenium
  • Experience editing and managing YAML and JSON configuration files
  • Knowledge of alert routing, escalation workflows, and reducing alert fatigue
  • Understanding of telemetry standards, labeling strategy, and data hygiene practices
  • Strong troubleshooting and analytical skills
Preferred Qualifications
  • Oracle and SQL database experience
  • Experience with SNMP, network flow data, or infrastructure performance monitoring
  • Exposure to AI/ML-based observability or anomaly detection initiatives

This role offers the opportunity to help shape the future of enterprise monitoring and observability while working on high-impact initiatives supporting large-scale infrastructure and application environments.
