Overview
Alpaca is a US-headquartered self-clearing broker-dealer and brokerage infrastructure for stocks, ETFs, options, crypto, fixed income, 24/5 trading, and more. Our recent Series D funding round brought our total investment to over $320 million, fueling our ambitious vision. Amongst our subsidiaries, Alpaca is a licensed financial services company, serving hundreds of financial institutions across 40 countries with our institutional-grade APIs.
This includes broker-dealers, investment advisors, wealth managers, hedge funds, and crypto exchanges, totaling over 9 million brokerage accounts. Our global team is a diverse group of experienced engineers, traders, and brokerage professionals who are working to achieve our mission of opening financial services to everyone on the planet. We're deeply committed to open-source contributions and fostering a vibrant community, continuously enhancing our award-winning, developer-friendly API and the robust infrastructure behind it.
Alpaca is proudly backed by top-tier global investors, including Portage Ventures, Spark Capital, Tribe Capital, Social Leverage, Horizons Ventures, Unbound, SBI Group, Derayah Financial, Elefund, and Y Combinator.
Our team members:
We're a dynamic team of 230+ globally distributed members who thrive working from our favorite places around the world, with teammates spanning the USA, Canada, Japan, Hungary, Nigeria, Brazil, the UK, and beyond.
Your Role
As a Site Reliability Engineer (SRE) at Alpaca, you will be responsible for ensuring the reliability, scalability, and performance of our systems and services. You will work closely with development, operations, and DevOps teams to build and maintain robust applications, ensuring they run smoothly and efficiently. This role requires a blend of software engineering and operations skills, with a strong ability to troubleshoot technical issues and resolve problems before they impact our users.
Things You Get To Do
Triage difficult technical problems and implement solutions
Enhance our RabbitMQ and Redpanda observability stack by defining Service Level Objectives (SLOs) and alerts, as well as implementing profiling and logging
Improve the reliability of our RabbitMQ and Redpanda clients
Incident Management: Respond to and resolve incidents in a timely manner, conducting post-incident reviews to identify and implement improvements
Collaboration: Work closely with development teams to ensure new features and services are designed with reliability and scalability in mind
Capacity Planning: Monitor system capacity and performance, making recommendations and implementing changes to handle future growth
Who You Are (Must-Haves)
5+ years of experience in Site Reliability Engineering, Performance Engineering, or similar roles
5+ years of experience with message brokers similar to Kafka, RabbitMQ, and Redpanda
Proven track record of managing and maintaining large-scale, high-availability, and high-performance distributed systems
Experience designing and implementing SLIs, SLOs, and SLAs for internal and third-party systems with comprehensive alerting and monitoring
Strong ability to work independently, lead and deliver on large tasks, and collaborate with other members of the organization or external partners
Significant production experience with Kubernetes
Proficient with Go
Proficient with Prometheus
Proficient with Linux
Experience with troubleshooting message broker performance issues
Nice-to-Haves
Knowledgeable in trading/fintech domains
Experience with low-latency systems
Experience with Loki and Tempo
Experience with distributed tracing
Experience with the USE method
Experience with perf, bpf, pprof
How We Take Care of You
Competitive Salary & Stock Options
Health Benefits
New Hire Home-Office Setup: One-time USD $500
Monthly Stipend: USD $150 per month via a Brex Card
Alpaca is proud to be an equal opportunity workplace dedicated to pursuing and hiring a diverse workforce.