AI Infra Engineer

Job in San Jose, Santa Clara County, California, 95199, USA
Listing for: Black Sesame Technologies Inc
Full Time position
Listed on 2026-06-23
Job specializations:
  • IT/Tech
    AI Engineer (Applied/Software), Machine Learning/ ML Engineer, Systems Engineer
  • Engineering
    AI Engineer (Applied/Software), Systems Engineer
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD yearly
Job Description

We are looking for a highly motivated HW/Algorithm Co-Design Engineer to join our NPU Hardware Architecture Team. In this role, you will operate at the critical intersection of AI model innovation and silicon architecture, working closely with both algorithm and hardware teams to ensure that state-of-the-art models are efficiently mapped, optimized, and deployed on our in-house NPU platform.

This is a high-impact role for engineers who are passionate about bridging the gap between model design and hardware execution. You will help shape model-friendly architecture practices, drive cross-functional optimization, and influence the evolution of next-generation AI computing platforms from a system-level perspective.

Key Responsibilities
  • Partner closely with algorithm teams to understand model architectures, operator patterns, training/inference workflows, and deployment requirements, and guide them toward NPU-friendly design choices.
  • Analyze AI models from a hardware architecture perspective, identifying bottlenecks in compute, memory access, data movement, bandwidth utilization, and parallelism.
  • Drive hardware/algorithm co-design initiatives to improve model efficiency, performance, energy efficiency, and deployability on our in-house NPU.
  • Define and promote best practices for NPU-friendly model design, including operator selection, graph patterns, quantization readiness, memory-efficient structures, and execution-friendly network topologies.
  • Collaborate with hardware architects, compiler engineers, runtime/software teams, and algorithm researchers to enable end-to-end optimization across the full stack.
  • Evaluate emerging AI models and workload trends, and identify opportunities to improve future NPU capabilities through architecture-aware algorithm guidance.
  • Serve as a technical bridge between algorithm innovation and hardware realization, ensuring that advanced models can be translated into scalable and efficient production deployments.
Qualifications
  • Master’s degree or above in Computer Science, Electrical Engineering, Computer Engineering, Applied Mathematics, or a related field.
  • 3+ years of relevant industry experience, preferably in AI accelerators, NPU/GPU architecture, deep learning systems, or hardware/software co-design.
  • Strong understanding of deep learning fundamentals and modern model architectures such as CNNs, Transformers, and other large-scale AI models.
  • Solid knowledge of AI accelerator architecture concepts, including compute engines, memory hierarchy, dataflow, bandwidth constraints, and parallel execution.
  • Proven ability to analyze model behavior and identify architecture-sensitive performance bottlenecks.
  • Familiarity with common AI frameworks and deployment toolchains such as PyTorch, ONNX, and model profiling/optimization tools.
  • Strong problem-solving skills, with the ability to reason across algorithm, software, and hardware layers.
  • Excellent communication and cross-functional collaboration skills, with the ability to work effectively across multiple engineering disciplines.
Preferred Qualifications
  • Experience with in-house NPU/ASIC development, AI compiler stacks, or performance modeling for ML workloads.
  • Familiarity with model optimization techniques such as quantization, sparsity, operator fusion, graph optimization, and low-precision computation.
  • Hands-on experience optimizing real-world workloads in areas such as computer vision, autonomous driving, multimodal AI, or large language models.
  • Experience in workload characterization, roofline analysis, memory bandwidth analysis, or architecture/performance tradeoff studies.
  • Demonstrated success influencing model design decisions based on hardware execution characteristics.
What We Value
  • A system-level mindset and the ability to connect model behavior with architectural implications.
  • Strong technical curiosity and the drive to push both algorithm efficiency and hardware capability forward.
  • A practical engineering approach focused not only on making models run, but on making them run efficiently, robustly, and competitively on our platform.
  • The ability to thrive in a highly collaborative environment where architecture, software, and algorithms evolve together.