Site Reliability Engineer - Onsite, New York, NY, USA (IT/Tech)

Position: Site Reliability Engineer - 5 days onsite, New York, NY
Location: New York

Overview

We’re looking for a Site Reliability Engineer to own the design and operation of a hybrid infrastructure spanning both cloud and on-premises environments. You’ll work directly with large fleets of IoT devices, build and improve deployment and monitoring tooling, and lead modernization efforts from architecture to production. This is a hands-on role where you’ll partner closely with engineering teams to deliver secure, reliable, and scalable systems while mentoring others and enforcing strong infrastructure best practices.

The company is located in New York, NY, and the role is onsite 5 days a week.

What You Will Be Doing

Design, build, and operate infrastructure across cloud and on-prem environments with a focus on reliability, security, and scalability.
Develop and maintain tools to deploy, manage, and monitor customer-premise equipment, IoT devices, and backend services.
Lead system architecture and modernization initiatives from design through implementation and ongoing production support.
Improve observability for infrastructure and IoT fleets by building custom tooling and integrating off-the-shelf monitoring solutions.
Implement and harden secure connectivity patterns to ensure resilient, safe device-to-cloud communication.
Mentor teammates and raise the bar on infrastructure, monitoring, and system design practices.
Collaborate with DevOps and software teams to tightly integrate infrastructure with CI/CD pipelines for smooth, reliable releases and upgrades.

Required Skills & Experience

5+ years of experience in Infrastructure Engineering, SRE, or DevOps roles.
Bachelor’s degree in Computer Science or equivalent experience.
Proven experience managing servers and/or IoT devices at scale.
Strong software development skills for infrastructure tooling, preferably in Python.
Hands-on experience with core AWS services.
Demonstrated ability to build, manage, and secure hybrid (cloud + on-prem) environments.
Solid understanding of Infrastructure as Code (Terraform, Ansible, CDK, etc.) for repeatable automation.
Strong Linux and networking fundamentals, including Docker, firewalls, load balancers, and VPNs.
Deep experience with monitoring, alerting, and troubleshooting large-scale distributed systems.
Strong problem-solving skills and comfort working in fast-paced, production-critical environments.
Excellent communication skills in English, with a collaborative and proactive working style.

Desired Skills & Experience

Experience with AWS IoT services, such as Greengrass v2.
Knowledge of secure infrastructure design principles and best practices.
Experience managing edge devices at scale.

Applicants must be currently authorized to work in the United States on a full-time basis now and in the future. This position does not provide sponsorship.
