
Neural Network Optimization Engineer

Job in City Of London, Central London, Greater London, England, UK
Listing for: Recraft, Inc.
Full Time position
Listed on 2025-12-30
Job specializations:
  • Software Development
    AI Engineer, Data Scientist
Location: City Of London

About Us

Founded in the US in 2022 and now based in London, UK, Recraft is an AI tool for professional designers, illustrators, and marketers, setting a new standard for excellence in image generation.

We designed a tool that lets creators quickly generate and iterate original images, vector art, illustrations, icons, and 3D graphics with AI. Over 3 million users across 200 countries have produced hundreds of millions of images using Recraft, and we’re just getting started.

Join a universe of professional opportunities, develop and support large‑scale projects, and shape the future of creativity. We are committed to making Recraft an essential, daily tool for every designer and setting the industry standard. Our mission is to ensure that creators can fully control their creative process with AI, providing them with innovative tools to turn ideas into reality.

If you’re passionate about pushing the boundaries of AI, we want you on board!

Job Description

We are seeking an experienced Neural Network Optimization Engineer who will specialize in enhancing the performance, latency, and throughput of neural network inference workflows. The ideal candidate will have substantial hands‑on experience optimizing inference workloads using technologies such as TensorRT, Triton language, and model quantization techniques. You will collaborate closely with ML researchers to ensure that our machine learning models run at peak efficiency and reliability in production environments.

Key Responsibilities
  • Optimize neural network models for inference performance and latency reduction.
  • Implement model quantization methods (e.g., INT8, FP8) to maximize computational efficiency.
  • Benchmark, analyze, and improve inference performance on targeted hardware platforms.
  • Collaborate with ML researchers to deploy optimized models in production environments.
  • Stay updated with the latest developments in model optimization, inference engines, quantization methods, and related technologies.
Requirements
  • Proven professional experience optimizing neural network inference workloads.
  • Strong expertise with TensorRT, Triton language, and CUDA programming.
  • Experience with neural network quantization techniques.
  • Proficiency in Python and PyTorch.
  • Deep understanding of GPU architectures and performance optimization.
  • Excellent problem‑solving skills and ability to analyze performance bottlenecks.
What We Offer
  • Competitive salary.
  • We’re able to offer Skilled Worker visa sponsorship in the UK for qualified candidates.
  • Opportunities for professional growth and development.
  • A collaborative and user‑focused work environment.
  • The chance to shape the future of AI‑powered creativity through research.
  • Exciting projects where your insights will directly impact product development.