Senior Software Engineer/SRE - Trade Automation & Execution, NY
Listed on 2026-02-05
IT/Tech
Systems Engineer, Cloud Computing
Location: New York
Business Area: Engineering and CTO
Description & Requirements
The Trade Automation & Execution (TRAX) group builds the platforms and services that power modern electronic trading. We design and operate high-performance, distributed, real-time systems used by financial institutions worldwide to execute trades, automate workflows, and make data-driven decisions. As markets evolve toward automation, scale, and intelligence, ensuring these platforms remain scalable, resilient, and predictable is critical. This is where TRAX Reliability plays a key role.
We are looking for an experienced engineer to help ensure that our real-time trading systems can scale safely, perform reliably under extreme market conditions, and recover gracefully from failures before issues impact clients.
Our Team
The TRAX Reliability team partners closely with application and infrastructure engineers to embed scalability, resilience, and technical risk management into trading systems from the ground up.
Rather than reacting to production incidents, we take a data-driven, proactive approach. We study real production workloads and controlled experiments to understand how systems behave under load, how failures propagate, and where bottlenecks emerge. By connecting performance, capacity, and risk, we help teams plan for growth, traffic spikes, and adverse scenarios with clear scaling strategies and recovery expectations.
We also design and build tooling that continuously evaluates system risk and performance. This includes running targeted stress tests, collecting detailed metrics, and surfacing insights through real-time dashboards. These tools enable teams to quickly identify bottlenecks across services, queues, and infrastructure, and to understand their impact on client experience.
What’s in it for you
Have direct impact on the stability and resilience of execution platforms relied upon by the world’s leading buy-side firms
Work on real-world, high-stakes distributed systems that need to operate under high performance and reliability requirements
Develop deep expertise in scaling, failure modes, and technical risk management for real-time trading systems
Collaborate with engineers across New York, London, and Frankfurt, significantly expanding your technical network
Partner with application, observability, and infrastructure teams to influence system design across the organization
Identify, prioritize, and track scalability and reliability risks across large-scale trading platforms
Partner with application teams to diagnose and address performance and resilience challenges
Analyze system behavior under real and simulated load, including latency, throughput, failover, and blast radius
Design and run chaos engineering experiments and game-day exercises to validate system capacity and resilience
Build and maintain automation and tooling for early detection and mitigation of production risks
Communicate technical trade-offs, solutions, and roadmaps clearly to engineering stakeholders
Plan for traffic growth and peak market events with clear scaling strategies and guardrails
5+ years of professional experience with a high-level programming language such as Python, Java, or C++, preferably on Unix/Linux
Solid understanding of Unix/Linux fundamentals
Hands‑on experience contributing to or triaging scaling and reliability issues in production distributed systems
Experience working with metrics, monitoring, or observability platforms, such as Grafana, Prometheus, or log analytics tools
Strong analytical skills and the ability to reason about complex system behavior and failure modes
Familiarity with chaos engineering, fault injection, or load testing frameworks
A track record of writing blameless postmortems and leading game‑day or incident review exercises
Curiosity and willingness to learn across all layers of the software and infrastructure stack
Why TRAX Reliability
You’ll work on systems that sit at the heart of global financial markets. Your work will directly influence how trades are executed, how platforms behave during…