At Taalas we believe that fundamental progress is achieved by those who are willing to understand and attack a problem end-to-end, without regard for commonly accepted abstractions and boundaries. We are building a team of hands-on technologists who dislike overspecialization and seek to excel in both depth and breadth. In this position the successful candidate will build software infrastructure for an inference serving cluster built around Taalas' hardcore AI model chips.
Job Responsibilities
Adapt open‑source inference servers like vLLM and Punica to interface with Taalas’ hardcore AI models
Implement a highly efficient LoRA swapping solution for multi-tenant, multi-LoRA environments
Build and test a scalable inference serving cluster using Kubernetes and Traefik, or similar tools
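To illustrate the kind of problem behind the LoRA swapping responsibility above, here is a minimal sketch of an LRU adapter cache for a multi-tenant serving node. Everything here is hypothetical (the `LoRACache` class, slot model, and swap counter are illustration only, not Taalas' actual design): it models a fixed number of on-device adapter slots shared by many tenants, evicting the least-recently-used adapter on a miss.

```python
from collections import OrderedDict


class LoRACache:
    """Hypothetical LRU cache of LoRA adapters, modeling a fixed
    number of on-device adapter slots shared across tenants."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Maps adapter id -> slot index, ordered from least to most
        # recently used.
        self._slots: "OrderedDict[str, int]" = OrderedDict()
        self.swaps = 0  # counts host-to-device adapter loads

    def acquire(self, adapter_id: str) -> int:
        """Return the slot holding adapter_id, loading it (and
        evicting the LRU adapter if the cache is full) on a miss."""
        if adapter_id in self._slots:
            self._slots.move_to_end(adapter_id)  # hit: mark most recent
            return self._slots[adapter_id]
        if len(self._slots) < self.capacity:
            slot = len(self._slots)  # a free slot is still available
        else:
            # Evict the least-recently-used adapter and reuse its slot.
            _, slot = self._slots.popitem(last=False)
        self.swaps += 1  # stands in for the actual weight copy-in
        self._slots[adapter_id] = slot
        return slot


# Usage: with two slots, requests A, B, A cost two loads (A is a hit);
# a request for C then evicts B, the least recently used adapter.
cache = LoRACache(capacity=2)
cache.acquire("tenant-A")
cache.acquire("tenant-B")
cache.acquire("tenant-A")
cache.acquire("tenant-C")
```

A production version would overlap the weight transfer with ongoing inference and batch requests by resident adapter, but the eviction policy above is the core of the scheduling problem.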
Qualifications
Bachelor’s or higher degree in Computer Science, or Electrical/Computer Engineering
Experience with Kubernetes, HTTP load balancers, and web servers
Good knowledge of computer architecture and low‑level programming:
Linux virtual memory and page table management, direct memory access, CUDA
Familiarity with ML, Python and Pytorch
Interested in joining our team? Submit your resume to be considered for this exciting opportunity!
Seniority level:
Entry level
Employment type:
Full‑time
Job function:
Engineering and Information Technology
Industries:
Semiconductor Manufacturing