
Site Reliability Engineering (SRE)

Job in San Leandro, Alameda County, California, 94579, USA
Listing for: TechDigital Group
Full Time position
Listed on 2025-12-01
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, SRE/Site Reliability, IT Support
Job Description & How to Apply Below
Position: Site Reliability Engineering (SRE)

We are looking for an SRE candidate with a strong Java development background and hands-on experience building dashboards and setting up alerts using Splunk, Grafana, and GCL.

Required Qualifications:

  • 10+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 10+ years of experience in Production support/Site Reliability Engineering teams with continued focus on improving Platform health
  • Familiar with Agile or other rapid application development practices
  • Hands-on expertise with Automated testing, Process Automation & building dashboards using APM tools.
  • Experience with distributed (multi-tiered) systems, algorithms, relational databases, and NoSQL databases.
  • Knowledge of and exposure to caching tools (Redis, Memcached) or messaging tools such as MQ and Kafka.
  • Must have working knowledge of APM tools such as Splunk, GCL, ELK, Grafana, Prometheus, etc.
  • Able to create dashboards using GCL/Splunk/ELK and set up alerts.
  • Working knowledge of CI/CD is a plus: source control such as Git, and continuous integration tools such as Jenkins / UCD Release.
  • Ability to work with engineering teams across the ecosystem, such as Security, Networking, and Infrastructure, on challenges that can impact platform health & resiliency.
  • Shell scripting and DevOps tools like Ansible, with good knowledge of YAML for writing playbooks.
  • Experience with distributed storage technologies like NFS, as well as dynamic resource management frameworks such as PCF, Kubernetes/OpenShift, AWS, or Azure.
  • Tech Stack:
    Java/J2EE (Spring, Spring Boot), Python, Shell Scripting, Kafka, Oracle, MongoDB, etc.
  • Able to work on shift duty in a 12/7 support organization.

Job Expectations:

  • You will be a core member of an SRE support team, using the latest tools to write code and test cases, work with API specs, and automate tasks to maintain the resiliency, performance, and availability of Digital Sales & Marketing platforms.
  • Strong, relevant experience supporting Web/API platforms built on a Java/JavaScript stack (Spring/Spring Boot, JavaScript with Angular/React).
  • Proficiency in dealing with legacy infrastructure along with cloud infrastructure (on-prem & 3rd party) such as PCF or Azure.
  • Identify opportunities to adopt new technologies, remove toil, and continue to drive efficiency & optimization.
  • Proactive monitoring of app performance through Splunk, app dashboards, AppDynamics, Dynatrace, etc.
  • Represent Platform engineering teams during production outages and collaborate with engineering teams to resolve production outages. Collaborate with stakeholders across engineering functions to own/derive RCA & work towards permanent resolution.
  • Plan, support, execute and comply with governance programs/processes in support of a strong control environment in your functional area. Leverage process documentation to improve operational controls and identify and remediate process deficiencies.
  • Proactively identify, communicate, mitigate and escalate risk originating from non-compliance of processes, operational errors, and data integrity issues in all applicable processes.
  • Ability to influence SRE practices within and outside teams to enable a strong DevOps culture within the organization.
  • Responsible for working with engineering teams to maintain SLAs & SLOs, continually looking for opportunities to improve platform metrics and communicating them to stakeholders.
  • Exposure to and proficiency in different API styles such as SOAP, REST, microservices, etc.
  • Working knowledge of Unix, Linux and Postman.