Senior DevOps Engineer/AWS_Hybrid; NYC
Listed on 2026-06-02
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability, IT Project Manager
Location: New York
We are looking for a Senior DevOps Engineer to own the infrastructure that keeps a fast-growing applied AI and data analytics platform running at enterprise scale. You will architect, build, and maintain the cloud systems that Fortune 500 retailers trust with their most critical data and decision‑making. This is a hands‑on, high‑ownership role – you will work closely with backend and ML engineers to ensure the platform is secure, scalable, and relentlessly reliable.
A key part of the role is taking over infrastructure leadership from the current DevOps lead over a transition period, then owning that domain long‑term. Over time, you will also be involved in growing the DevOps team, though the immediate priority is hands‑on technical execution.
- Schedule: Full-time
- Location: Hybrid (NYC)
- Salary: $160K – $200K
- Type of collaboration: Full-time employment
The platform connects an organization's entire data landscape – internal systems, social media trends, industry reports, consumer behavior signals – into a single coherent intelligence layer. It drives 8‑figure improvements in gross margins for Fortune 500 retailers. As enterprise onboarding ramps up, the infrastructure must scale to match. The team believes in GitOps, infrastructure as code, and building systems that let engineers move fast without breaking things.
You will be the person who makes that real – standing up infrastructure, defining reliability targets, automating manual processes, and owning production incidents end‑to‑end. Former technical founders are a strong fit: what you've built matters more than tenure.
- 4–7+ years of industry experience, with at least 4 years in hands‑on DevOps/infrastructure roles
- 4+ years managing cloud infrastructure in production – AWS strongly preferred
- 2+ years of production Kubernetes experience (EKS preferred)
- Deep proficiency in Terraform/Terragrunt and infrastructure‑as‑code principles
- Strong scripting skills in Python and Bash, with a habit of automating everything possible
- Experience with CI/CD platforms (GitHub Actions, GitLab CI, Jenkins, or similar)
- Security‑first mindset with experience implementing compliance frameworks (SOC 2, GDPR)
- Systems‑level, architectural, and strategic thinking – can articulate reliability targets, scaling policies, and rollback criteria regardless of specific tooling
- Experience supporting ML/AI workloads and GPU infrastructure
- Familiarity with service mesh architectures and advanced networking
- Experience with multi‑tenant enterprise SaaS infrastructure
- Background in cost optimization and FinOps practices
- Architect and build secure, scalable cloud infrastructure using IaC (Terraform/Terragrunt) on AWS
- Design and maintain robust CI/CD pipelines; convert manual processes into automated, repeatable workflows
- Own production reliability – set up observability stacks, define SLIs and SLOs, and lead incident response
- Implement and champion GitOps workflows for all infrastructure deployments
- Build compliance‑ready infrastructure with IAM best practices and secrets management
- Optimize cloud costs while maintaining performance and reliability at scale
- Create developer tooling and documentation that accelerates engineering workflows across the team
- Mentor team members on DevOps best practices and guide infrastructure strategy
- Take over infrastructure leadership over a structured transition period; own this domain long‑term
- Application reviewed by a human
- Intro call with the recruiting team
- Technical conversation with the infrastructure lead
- On‑site – meet the team