Senior Observability Engineer

Job in Irvine, Orange County, California, 92713, USA
Listing for: Ensono
Full Time position
Listed on 2026-06-18
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing: Infrastructure & Operations, Cybersecurity, IT Support
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD Yearly

About the role and what you’ll be doing:

As a Senior Observability Engineer, you will be responsible for designing, implementing, migrating and optimizing end‑to‑end monitoring and observability solutions to ensure the reliability, performance and resilience of distributed systems and application services for our clients. You will play a critical role in advancing observability maturity across complex legacy, distributed and hybrid environments by enabling proactive detection, rapid diagnosis and efficient resolution of incidents, and by applying AI capabilities.

What You Will Do:
  • Assess the current state of monitoring and observability across applications and systems, including identifying alert fatigue, monitoring gaps, and coverage deficiencies.
  • Define and execute strategies to incrementally improve the monitoring and observability maturity of platforms, applications, and infrastructure.
  • Design and implement end‑to‑end observability solutions that provide comprehensive visibility into business transactions, service dependencies, and underlying technical components.
  • Establish and promote monitoring best practices focused on noise reduction, controlled metric cardinality, and the prevention of duplicate or redundant telemetry.
  • Define and implement automated alerting strategies aligned with Service Level Objectives (SLOs) and Service Level Agreements (SLAs) to ensure actionable and meaningful alerts.
  • Develop and enforce monitoring audit standards to support governance, compliance, and regulatory requirements.
  • Act as an escalation point for complex or critical monitoring‑related incidents and provide strategic guidance and recommendations to engineering and operations teams.
  • Automate monitoring configurations, policy management, and telemetry collection using CI/CD pipelines and Infrastructure as Code (IaC) practices with tools such as Helm, Ansible, and Terraform.
  • Build reusable automation frameworks and standardized reporting solutions to support consistent monitoring rollouts, configuration management, and operational insights.
  • Leverage AI and machine learning techniques to enhance observability outcomes, including intelligent anomaly detection, alert noise reduction, predictive incident identification, automated root‑cause analysis, and data‑driven insights to improve service reliability and operational efficiency.
Required Qualifications:
  • 10+ years of overall experience, including 7+ years of solid experience with APM, monitoring, observability and event management tools such as Dynatrace/AppDynamics, Splunk, Cortex, Prometheus, Grafana, and Netcool.
  • Experience with ITSM, ticketing tools and their integration with monitoring tools.
  • Proficiency in Application Workloads (Binary, Java, Python, .NET, Batch Jobs).
  • Experience in Python, Bash, PowerShell or JavaScript for task automation.
  • Exposure to CI/CD pipelines and IaC (Infrastructure as Code).
  • Strong analytical and problem‑solving skills for diagnosing complex issues.
  • Effective communication, individual leadership, and cross‑functional team collaboration.
  • Ability to think outside the box, sensitivity towards business impacts, and self‑awareness to refine processes.
  • Bachelor’s degree in computer science or an engineering field.
Preferred Qualifications:
  • Proficiency in broader aspects of monitoring and observability (APM, System Monitoring, Logs, Tracing, Visualization, Reporting and Integration).
  • Experience in automation, programming or coding sufficient to instrument monitoring solutions for a given platform, tool or practice.
  • Certified professional in Dynatrace/AppDynamics, Splunk, ITIL or AI.
Some of our benefits include:
  • Unlimited Paid Days Off
  • Three health plan options
  • 401k with company match
  • Eligibility for dental, vision, short and long‑term disability, life and AD&D coverage, and flexible spending accounts
  • Family Forming Benefit including fertility coverage and adoption/surrogacy reimbursement
  • Paid childbearing and paternal leave
  • Education Reimbursement, Student Loan Assistance or 529 College Funding
  • Sabbatical leave
  • Wellness program
  • Flexible work schedule

As of the date of this posting, a good‑faith estimate of the current pay scale for this role is 125,000…

Position Requirements
10+ Years work experience