
Model Training Acceleration Engineer

Job in San Jose, Santa Clara County, California, 95112, USA
Listing for: TikTok
Apprenticeship/Internship position
Listed on 2026-06-27
Job specializations:
  • IT/Tech
    Machine Learning/ ML Engineer, AI Engineer (Applied/Software)
Salary/Wage Range or Industry Benchmark: 156000 - 316800 USD Yearly
Job Description & How to Apply Below
Position: Large Model Training Acceleration Engineer

Location:

San Jose

Employment Type:

Regular

Job Code: A180280A

Responsibilities

The Intelligent Creation - AI Platform team builds advanced end-to-end AI production pipelines, including deep learning model training, optimization, deployment, and applications. We provide AI capabilities that empower content creation and consumption on TikTok and serve billions of users. We are seeking an experienced AI model optimization engineer with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration.

The ideal candidate will work at the cutting edge of AI efficiency, enhancing the performance, scalability, and deployment of large-scale generative AI models.

Responsibilities:

  • Optimize large model training pipelines to improve efficiency, speed, and scalability.
  • Develop and improve distributed training strategies, such as data parallelism, model parallelism, pipeline parallelism, and communication optimization, to accelerate model training.
  • Benchmark and profile deep learning models to identify performance bottlenecks and optimize computational resources.

Qualifications

Minimum Qualifications:

  • Master's or PhD in Computer Science, Electrical Engineering, Artificial Intelligence, or a related field.
  • 2+ years of experience in AI model training optimization.
  • Strong software engineering skills, including proficiency in Python, C++, and CUDA.
  • Strong proficiency in deep learning frameworks such as PyTorch, Megatron, and DeepSpeed.
  • Experience with distributed training techniques such as data parallelism, model parallelism, and pipeline parallelism.
  • Knowledge of transformers and diffusion models.

Compensation Description (Annually):
The base salary range for this position in the selected city is $156,000 - $316,800 annually.

Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.

Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, and wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year, and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).

The Company reserves the right to modify or change these benefits programs at any time, with or without notice.
