Cloud Inference Engineer

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Modular
Full Time position
Listed on 2026-05-29
Job specializations:
  • Software Development
    Software Engineer, Cloud Engineer - Software, DevOps, AI Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below

Requirements

  • We're seeking engineers who are passionate about pushing the boundaries of distributed inference systems and enjoy working at the intersection of large-scale systems and machine learning
  • We are looking for candidates based on their breadth and depth of experience in backend engineering, AI inference, and distributed systems development
  • 5+ years of experience working in backend engineering
  • Experience with Kubernetes and operating your own services
  • Ability to create durable, reusable software tools and libraries that are leveraged across teams and functions
  • Experience in machine learning technologies and use cases
  • Creativity and curiosity for solving complex problems, a team-oriented attitude that enables you to work well with others, and alignment with our culture
  • Strong identification with our core company cultural values
  • (Desirable) Experience with high-performance computing / networking
  • (Desirable) Experience working on high-scale ML inference infrastructure (traditional AI or genAI)
  • (Desirable) Familiarity with Go (Golang)
What the job involves
  • In the Cloud Inference team, we are focused on building end-to-end distributed LLM inference deployments that are fully vertically integrated with the MAX stack
  • Our goal is to make inference the fastest and most scalable available while also building the easiest platform for deploying and scaling models, for enterprises and developers alike
  • If this sounds exciting, we invite you to join our world-leading AI infrastructure team and help drive our industry forward!
  • Build and ship an LLM-focused inference platform using best-in-class inference techniques (disaggregated inference, multi-node deployment of large models, high-performance networking, distributed KV-cache management, high-throughput batch processing, etc.)
  • Push the envelope for operational excellence with request-to-kernel observability, multi-cloud deployments, clever autoscaling, cold-start optimizations, and more
  • Collaborate with our kernels and genAI teams to achieve state-of-the-art (SOTA) application performance by integrating SOTA kernel and serving optimizations with SOTA cluster optimizations
  • Build Helm charts, Kubernetes operators, and more to create simple, effective, maintainable deployments