
Site Reliability Engineer

Job in New York, New York County, New York, 10261, USA
Listing for: Autonomai Recruitment
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below
Location: New York City

The ideal candidate comes from a top-tier tech environment (FAANG, elite trading, hyperscale infra). They have experience building technology 0→1, owning systems end-to-end, and working close to the metal. They will operate across everything from bare-metal Linux to modern build and observability stacks.

Overview

Join a core engineering group as a Site Reliability Engineer, designing and scaling Linux platforms that underpin ML/AI-driven trading. You will architect and own reliability for massive simulation, HPC, and production workloads, ensuring ultra-reliable, ultra-fast trading systems.

Key Responsibilities
  • Deploy SRE practices for Linux platforms powering low-latency, high-throughput trading workloads.
  • Architect, optimize, and tune Linux for performance, resilience, and minimal latency.
  • Drive incident response, root cause analysis, and continuous reliability improvement across production systems.
  • Oversee system automation and reproducibility—build, deploy, and fleet-manage bare-metal Linux and containerized stacks.
  • Manage and enhance Kubernetes clusters, network configuration, and large-scale orchestration.
  • Set observability standards; expand monitoring, alerting, and performance metrics across platforms.
  • Analyze networking, kernel-level performance, and distributed systems—solving core challenges in a multi-petabyte, multi-cluster environment.
  • Build Python tools for automation, reliability engineering, and performance analysis.
  • Design highly distributed systems.
What You Will Work On
  • Ultra-reliable, high-performance trading infrastructure where every engineering optimization affects performance.
  • Next-generation simulation and HPC compute pipelines, supporting ML/AI workflows at scale.
  • Integration and continuous improvement of internal and open-source tools for automation and reliability.
  • Strategic platform direction: shaping foundational systems for critical infrastructure in an elite trading environment.
Team and Culture
  • Small, autonomous Linux SRE team with direct ownership and impact.
  • Collaborative engagement with quants, researchers, and trading experts to deliver robust platforms.
  • A culture built on deep technical ownership, learning, and high standards of performance engineering.

Apply now for an informal confidential chat!
