Site Reliability Engineer

Job in New York, New York County, New York, 10261, USA
Listing for: Berkley Hunt
Full Time position
Listed on 2026-02-15
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD per year
Job Description & How to Apply Below
Location: New York

Berkley Hunt has partnered with a high-growth fintech company to hire a Site Reliability Engineer to help build, operate, and scale a globally distributed, highly available cloud platform. This role focuses on reliability, automation, and operational excellence, working closely with engineering teams to ensure systems are resilient, scalable, and production-ready from day one.

Hybrid in Manhattan

Who You Are:
  • You think in systems, not silos; you naturally connect infrastructure decisions to customer experience and business impact.
  • You have strong experience running production environments at scale and understand what “good” looks like in terms of uptime, latency, and reliability.
  • You’re confident operating Kubernetes in real-world production settings, not just deploying to it.
  • You have a solid background in cloud architecture across AWS and GCP, and understand the trade-offs of distributed systems.
  • You are proactive about identifying risk and eliminating single points of failure before they become incidents.
  • You are comfortable working in fast-paced environments where priorities evolve and ownership is shared.
  • You believe infrastructure should be repeatable, observable, and continuously improving.
Responsibilities:
  • Architect and evolve cloud infrastructure to support a secure, highly available, and globally distributed fintech platform.
  • Embed reliability best practices into the development lifecycle, influencing design decisions before code reaches production.
  • Drive improvements in deployment workflows through GitOps and Infrastructure-as-Code methodologies.
  • Enhance system visibility by building robust monitoring, logging, and alerting frameworks.
  • Lead incident response efforts, conduct post-incident reviews, and implement preventative measures to strengthen platform resilience.
  • Continuously refine Kubernetes environments to improve performance, scalability, and operational efficiency.
  • Partner cross-functionally with engineering and product teams to balance speed of delivery with operational stability.
  • Reduce operational toil by identifying automation opportunities and improving internal tooling.