Site Reliability Engineer II
Listed on 2026-02-16
IT/Tech
Cloud Computing, SRE/Site Reliability, Systems Engineer, IT Support
Overview
Restaurant365 is a SaaS company disrupting the restaurant industry! Our cloud-based platform provides a unique, centralized solution for accounting and back-office operations for restaurants. Restaurant365’s culture is focused on empowering team members to produce top-notch results while elevating their skills. We’re constantly evolving and improving to make sure we are and always will be “Best in Class” ... and we want that for you too!
The Site Reliability Engineer II will be responsible for supporting, enhancing, and maintaining Restaurant365’s cloud infrastructure and applications. Qualified candidates will demonstrate growing expertise in site reliability practices, with skills in incident response, system monitoring, automation, and performance troubleshooting. You will collaborate with DevOps, development, and infrastructure teams to resolve moderately complex issues, propose improvements, and strengthen the reliability, scalability, and security of our SaaS platform.
- Execution & Collaboration
- Respond to production incidents, perform triage and troubleshooting, and contribute to post-incident analysis.
- Identify and automate manual processes to improve efficiency and reduce risk.
- Enhance and evolve monitoring tools and platforms to improve observability.
- Promote and apply best practices for reliability, scalability, and performance across engineering.
- Implement and support cloud automation using Terraform, Ansible, or CloudFormation.
- Work within change management protocols to provide maximum uptime for production systems.
- Participate in on-call rotation, providing 24x7 support for incidents and contributing to root cause analysis.
- Partner with developers, architects, vendors, and IT teams to ensure reliable system operations.
- Research and remediate vulnerabilities in coordination with security teams.
- Maintain documentation of infrastructure, monitoring, runbooks, and incident response procedures.
- Standards & Process
- Apply company policies and procedures when handling operational tasks and incidents.
- Suggest and implement improvements to operational processes and monitoring practices.
- Contribute to technical diagrams, documentation, and runbooks for system reliability.
- Learning & Growth
- Expand expertise in cloud services (Azure, AWS, or GCP) and container platforms (EKS, ECS, AKS).
- Build proficiency with observability and monitoring tools (Prometheus, Grafana, ELK, Site24x7, Nagios).
- Develop scripting and automation skills using Python, Bash, PowerShell, or similar.
- Participate in planning discussions by contributing technical input on system stability and reliability.
- BS in Computer Science, Information Systems, or related field (or equivalent experience).
- 2–4 years of experience in site reliability engineering, DevOps, or cloud operations.
- Experience with cloud platforms (Azure or AWS), including services such as AKS, ECS/EKS, Functions/Lambda, S3, and Blob storage.
- Proficiency with infrastructure-as-code and automation (Terraform, Ansible, YAML, Python, Bash, PowerShell).
- Strong Linux engineering skills; working knowledge of Windows administration.
- Experience supporting production environments and participating in on-call rotations.
- Familiarity with web servers and middleware (Nginx, Apache Tomcat).
- Experience with CI/CD tools (GitLab, Git, or similar).
- Strong written, oral, and interpersonal communication skills.
- Experience with monitoring tools (Prometheus, Grafana, ELK, Site24x7, Nagios).
- Knowledge of performance analysis and system vulnerability remediation.
- Cloud certification (AWS or Azure) preferred.
- Familiarity with restaurant industry SaaS platforms and customer-facing applications.
- This position has a salary range of $98,583-$138,016 annually. The above range represents the expected salary range for this position. The actual salary may vary based upon several factors, including, but not limited to, relevant skills/experience, time in the role, business line, and geographic location. Restaurant365 focuses on equitable pay for our team and aims for transparency with our pay practices.
- Comprehensive medical benefits, 100% paid for employee
- 401k + matching
- Equity Option Grant
- Unlimited PTO + Company holidays
- Wellness initiatives
#BI-Remote
DYN365, Inc d/b/a Restaurant365 is an equal opportunity employer.