AI/ML Model Compression & Quantization Engineer Job, Airdrie area, Alberta, Canada, Engineering

We’re a fast-paced, fabless semiconductor startup redefining the boundaries of AI through cutting-edge, scalable AI-infused multipurpose compute architecture. Our mission is to deliver scalable, efficient, and intelligent silicon solutions for the next generation of edge AI, robotics, autonomous systems, and mobile devices. Our leadership team brings together decades of experience in semiconductor innovation, spanning chip architecture, system design, and global business operations.

The team includes pioneers behind several generations of groundbreaking compute architectures; experts in software-hardware co-design, SoC, and AI development, with hundreds of patents in our portfolio; and leaders of multi-billion-dollar business units at top-tier technology companies.

Position Overview

This is a great opportunity to join a highly skilled AI/ML Software team working at the intersection of HW/SW co-design. In this role, you will be responsible for designing and executing end-to-end model compression pipelines, including sensitivity analysis, quantization, pruning, and hybrid optimization techniques across large-scale transformer architectures.

Key Responsibilities and Duties

Build and own the end-to-end compression pipeline

Baseline benchmarking and instrumentation
Sensitivity analysis

Implement layerwise sensitivity scoring frameworks

Design and apply quantization strategies

INT8, INT4, FP8, FP4 exploration
Per-layer/tensor precision assignment
Dynamic range calibration and scaling strategies

Implement and evaluate pruning techniques

Apply hybrid compression methods

QAT, LoRA-based recovery, distillation
Latency / throughput
Memory footprint

Optimize for iMachine Architecture

Qualifications and Skills

Successful candidates should possess the following qualifications and skills:

Required Qualifications (You must possess these qualifications to be considered for the position)

Bachelor of Science Degree in Electrical Engineering, Computer Science, Computer Engineering, or related field

1+ year of experience with PyTorch / JAX / TensorFlow

Understanding of:

Numerical precision and quantization theory

Hands-on experience with:

TensorRT, ONNX Runtime, or similar inference stacks

Familiarity with:

Sparse representations (CSR, COO, RLC)
Low-rank approximation methods (SVD, factorization)

Ability to analyze:

Numerical stability issues

Preferred Qualifications

MS or PhD in Electrical Engineering, Computer Engineering, Computer Science, or related field

Experience with:

Hardware-aware optimization

Knowledge of:

Deliver production-ready compressed models with minimal accuracy loss
Achieve quantifiable performance gains (latency, memory, throughput)
Build reusable tooling and automation pipelines

Why Join Us

Get in early at a breakthrough deep-tech startup reshaping AI compute
Work closely with industry innovators and experienced leaders where your work will have a direct impact on the success of the company
Be part of a mission-driven team building foundational technology for the future
We balance sharp execution with continuous innovation to push the boundaries
Competitive compensation, equity, and growth opportunities

Benefits and Perks

At I Machines, Inc., we offer competitive salaries and a comprehensive benefits package, including:

Health, dental, and vision insurance
Retirement savings plans
Paid time off and holidays
Flexible schedule

Equal Opportunity Employer

I Machines, Inc., is an equal opportunity employer and does not discriminate based on race, color, religion, gender, national origin, age, disability, or any other legally protected status. All qualified applicants will be considered for employment.
