Sr. Software Engineer, Inference
Listed on 2026-05-19
Software Development
Cloud Engineer - Software, Software Engineer, DevOps, AI Engineer
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About The Role
Our Inference team builds and maintains the critical systems that serve Claude to millions of users worldwide. We handle the entire stack from intelligent request routing to fleet‑wide orchestration across diverse AI accelerators. The team’s dual mandate is to maximize compute efficiency to serve explosive customer growth and to enable breakthrough research by providing scientists with high‑performance inference infrastructure to develop next‑generation models.
We tackle complex distributed systems challenges across multiple accelerator families and emerging AI hardware running on multiple cloud platforms. The work includes:
- Designing intelligent routing algorithms that optimize request distribution across thousands of accelerators.
- Autoscaling the compute fleet to dynamically match supply with demand across production, research, and experimental workloads.
- Building production‑grade deployment pipelines for releasing new models to millions of users.
- Integrating new AI accelerator platforms to maintain our hardware‑agnostic competitive advantage.
- Contributing to new inference features such as structured sampling and prompt caching.
- Supporting inference for new model architectures.
- Analyzing observability data to tune performance based on real‑world production workloads.
- Managing multi‑region deployments and geographic routing for global customers.
Qualifications
- Significant software engineering experience, particularly with distributed systems.
- Results‑oriented with a bias toward flexibility and impact.
- Ability to take on tasks that go beyond the job description.
- Enjoys pair programming.
- Wants to learn more about machine learning systems and infrastructure.
- Thrives in environments where technical excellence drives both business results and research breakthroughs.
- Concerned about the societal impacts of the work.
Relevant Experience
- High‑performance, large‑scale distributed systems experience.
- Implementing and deploying machine learning systems at scale.
- Experience with load balancing, request routing, or traffic management systems.
- LLM inference optimization, batching, and caching strategies.
- Kubernetes and cloud infrastructure (AWS, GCP, Azure).
- Python or Rust proficiency.
Annual Salary: $300,000 to $485,000 USD.