Manager - SRE

Job in Hackensack, Bergen County, New Jersey, 07601, USA
Listing for: Mphasis
Full Time position
Listed on 2026-05-17
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Systems Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD per year
Job Description & How to Apply Below

Role description

Role: Manager SRE

Automation Lead – Leads the Automation SRE team and is responsible for delivering end-to-end self-healing automation solutions that reduce manual effort (toil).

Primary Skill – Observability, telemetry, and event correlation

Secondary Skill – Shell scripting, Linux, and monitoring tools (BigPanda, Splunk, AppDynamics, etc.)

Automation Engineer
  • 15+ years of experience leading Automation SRE teams.
  • Advanced working experience with two or more of the following:
    Unix/Linux, Windows Server, Oracle, MSSQL, MongoDB.
  • Experience with Python, Java, cURL scripting, or other scripting languages.
  • Experience with two or more of the following observability tools:
    AppDynamics, BigPanda, Elasticsearch (ELK), Google Cloud Logging, Grafana, Prometheus, Splunk, ThousandEyes.
  • Experience with logging, monitoring, and event detection on Cloud or Distributed platforms.
  • Experience working with one or more of the following:
    AutoSys, cron, Windows Task Scheduler, or other batch schedulers.
  • Provides technical direction on monitoring and logging to less experienced staff, or develops highly complex original solutions. Acts as an expert technical resource for modeling, simulation, and analysis efforts.
  • Experience creating and modifying technical documentation such as environment flows, functional requirements, and nonfunctional requirements.
  • Outstanding problem solving and analytical skills with ability to turn findings into strategic imperatives.
  • Technical operations application support experience.
  • Minimum 4-6 years of hands-on SRE experience implementing and developing monitoring systems for application reliability using Splunk, Grafana, AppDynamics, and BigPanda.
  • The environment is entirely on-premises, so we require candidates who are strong in the above skills.
  • Overall, we are looking for an Automation Engineer who can reduce toil and improve the system's reliability and scalability.
Nature of the Job
  • Collaborate with the Production support team to identify existing manual activities and automate them.
  • Identify areas of toil that can be automated to avoid manual intervention.
  • Build a monitoring system and observability platform that provides more stack traces and dashboards.
  • Define SLAs, SLOs, and SLIs and implement them for better monitoring.
  • Scalability, reliability, and observability are the primary goals, with the aim of reducing MTTD and MTTR.
Other details
  • Deputation Location: US – New Jersey – New Jersey
