Lead Infrastructure Engineer - Infrastructure Monitoring
Job in
Wilmington, New Castle County, Delaware, 19801, USA
Listed on 2026-06-02
Listing for:
JPMorgan Chase & Co.
Full Time position
Job specializations:
IT/Tech
Systems Engineer, Cybersecurity, IT Support, Cloud Computing
Job Description & How to Apply Below
Category:
Infrastructure Engineering
Job Schedule:
Full time
Posted Date: T13:32:39+00:00
We have an exciting opportunity for you to collaborate with passionate professionals, solve complex problems, and grow your career in a supportive, innovative environment.
As a Lead Infrastructure Engineer at JPMorgan Chase within Corporate Technology's Enterprise Observability Platforms, you will help build and operate a strategic, market-leading Infrastructure Monitoring platform that strengthens critical service resilience and delivers trusted operational insights. You will be a hands-on technical contributor on a high-performing agile team, building secure, stable, and scalable observability solutions: turning telemetry into actionable insights, modernizing event-to-incident workflows, and enabling automation and AIOps-driven reliability improvements aligned to the firm's business objectives.
Job responsibilities
* Engineer, operate, and continuously improve the firm's Infrastructure Monitoring platforms, ensuring availability, performance, scalability, and security.
* Build and run enterprise-grade Infrastructure Monitoring capabilities across Linux, Windows, and complex Network estates, including platform-level onboarding and lifecycle management.
* Design and implement platform services, integrations, and telemetry collection across metrics, logs, and events, including OpenTelemetry collection patterns where applicable.
* Develop and maintain standardized onboarding patterns (agents/collectors, configurations, dashboards, alert policies) to accelerate safe adoption at scale.
* Improve monitoring signal quality and usability through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment.
* Develop secure, high-quality automation and production code; review, debug, and improve code/configuration written by others.
* Automate platform operations and reduce toil through scripting and CI/CD-driven configuration management; implement infrastructure-as-code deployment patterns.
* Manage and maintain production health for the monitoring platform: lead triage, perform root cause analysis (RCA), and deliver preventative engineering and resilience improvements.
* Partner with infrastructure, application, and SRE teams to align platform capabilities to SLIs/SLOs, operational readiness, and continuous improvement goals.
* Contribute to a culture of diversity, opportunity, inclusion, and respect.
Required qualifications, capabilities, and skills
* Formal training or certification in infrastructure engineering concepts and 5+ years of applied experience.
* Proficiency with enterprise operating systems (Linux and/or Windows), including administration, troubleshooting, performance analysis, and operational best practices within regulated production environments.
* Proven hands-on experience delivering and operating enterprise-scale Infrastructure Monitoring solutions across Linux, Windows, and/or Network estates.
* Solid understanding and hands-on implementation of observability and telemetry concepts, including metrics, logs, and events, with experience using OpenTelemetry collection patterns and integrating telemetry into downstream components.
* Proficiency in automation and engineering practices, including scripting and development with Python, Ansible, and PowerShell/Bash, and applying CI/CD-driven workflows for controlled, secure, and repeatable change management.
* Well-rounded experience in infrastructure across hardware platforms, operating systems, networking, storage, and databases (MS SQL Server, Oracle, Cassandra), including common deployment patterns, integration architectures, scaling and resiliency considerations, and performance assessment.
* Experience implementing Infrastructure-as-Code (IaC) and configuration management practices using tools such as Terraform, enabling standardized provisioning and scalable, repeatable deployments.
* Hands-on experience operating in hybrid infrastructure environments, including enterprise on-prem platforms and public/private cloud, with familiarity supporting and migrating monitoring capabilities across cloud boundaries.
* Demonstrated ability to improve monitoring signal quality through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment, supporting reliable event-to-incident workflows and operational insights.
* Experience developing, reviewing, debugging, and maintaining secure, high-quality production code and platform configurations, including automation supporting monitoring platforms and platform operations.
Preferred qualifications, capabilities, and skills
* Hands-on experience operating one or more enterprise monitoring platforms such as SCOM, Tivoli, SMARTS, IBM Instana, DX NetOps, ITNM, or Netcool Suite.
* Experience with modern observability ecosystems such as Splunk, Dynatrace, Grafana, and Prometheus, and with interoperability patterns for telemetry integration, routing, and visualization.
* Experience with Kubernetes (e.g., EKS) for container…