Sr Engineer Site Reliability
Job in
Bethpage, Nassau County, New York, 11714, USA
Listed on 2026-06-02
Listing for:
Optimum
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, Cloud Computing
Job Description & How to Apply Below
We are Optimum, a leader in the fast-paced world of connectivity, and we're on the hunt for enthusiastic professionals to join our team! We understand that connectivity isn't just a luxury anymore - it's a necessity that empowers lives, fuels businesses, and drives innovation. A career at Optimum means you'll be enabling progress and enhancing lives by providing reliable, high-speed connectivity solutions that keep the world connected.
We owe our success to our amazing product, commitment to our people and the connections we make in every community.
If you are resourceful, collaborative, team-oriented and passionate about delivering consistent excellence, Optimum is the Company for you!
We are Optimum!
Job Summary
As a Site Reliability Engineer III, you will be a primary driver in the long-term management and stabilization of our Hybrid Cloud infrastructure. We maintain a permanent dual-hosting strategy, operating both Google Cloud Platform (GCP) and a mission-critical On-Premises Unix/Linux footprint. You will bridge the gap between physical hardware and modern cloud-native operations, applying software engineering principles to ensure our systems are scalable, secure, and predictable across all platforms.
The Mission: Hybrid Reliability & Stabilization
Your mission is to unify our GCP and On-Premises environments into a single, reliable platform. Your first 12 months will focus on Stabilization and Observability. You will lead the transition away from "toil" (manual, repetitive operations) toward high-leverage automation, aggressively addressing on-prem technical debt while implementing modern SRE practices across our global data centers and cloud projects.
Responsibilities
- Hybrid Platform Standardization: Audit, harden, and standardize Unix (Solaris/AIX) and Linux (RHEL/Ubuntu) environments across both GCP Compute Engine and physical bare-metal servers.
- Infrastructure Stewardship (DC Support): Serve as the engineering lead for our Eastern U.S. data centers; ensure hardware health, power redundancy, and physical security standards are enforced through code and automated checks.
- Storage Engineering (Specialization): Architect and manage enterprise-grade SAN/NAS environments alongside GCP Cloud Storage/Persistent Disk. Optimize for low latency and high IOPS while ensuring all data-at-rest complies with our Annual Encryption Strategy.
- Automation of Toil: Design and maintain robust automation pipelines (Ansible, Terraform, Python) to ensure configuration parity and eliminate drift between cloud and on-premises environments.
- Vulnerability Management: Transition the fleet from a "vulnerable" state to a "reliable" one by establishing a sustainable, automated monthly patching cadence.
- Unified Observability: Implement and scale a "single pane of glass" monitoring stack (Prometheus, Grafana, Loki) to provide real-time health metrics for the entire hybrid estate.
- Incident Response & Post-Mortems: Participate in a sustainable on-call rotation. Lead Blameless Post-Mortems for incidents involving cross-platform dependencies to ensure we "fix the system, not the person."
Qualifications
Technical Requirements (SRE3)
- OS Internals: Deep proficiency in Linux (RHEL/Ubuntu) and Unix (Solaris/AIX) administration and kernel tuning
- Cloud Proficiency: Hands-on experience with GCP (IAM, VPC, Compute Engine) or equivalent public cloud providers
- Infrastructure as Code: Proven ability to manage complex environments using Terraform and Ansible
- Storage Protocols: Proficiency in Fibre Channel, iSCSI, and NFS. Experience with enterprise arrays (NetApp, Dell/EMC, or Pure Storage) is highly preferred
- Software Engineering: Strong scripting ability in Python or Go to build internal tools and automation.
- Security: Strong understanding of CVE life cycles and cryptographic standards (AES-256)
- Bachelor's degree in Telecommunications, Computer Engineering, or related discipline
- 6+ years of experience in IP networking and infrastructure support, with at least 4 years in reliability-focused roles
Taking Ownership, Upholding Transparency, Creating Community, and Demonstrating Expertise.
Our commitment to empowering employees to take responsibility and embrace proactive problem-solving underpins Taking Ownership. Upholding Transparency is at the core of our culture, with open and honest communication fostering trust among our dedicated team and loyal customers. Creating Community is more than a goal; it's our daily commitment to fostering an environment of collaboration, innovation, and positivity. Demonstrating Expertise is a promise we uphold through continuous learning and engagement with our customers to consistently deliver top-quality products and services.
These pillars not only shape our culture but define Optimum as a place of excellence, trustworthiness, and thriving community, and we invite you to be a…