Deep Learning Engineer - Model Compression; Technical Ba
Published 2025-12-27
-
IT/Technology
AI Engineer, Machine Learning, Data Scientist
Why join us?
We are a European deep-tech leader in quantum and AI, backed by major global strategic investors and strong EU support. Our groundbreaking technology is already transforming how AI is deployed worldwide — compressing large language models by up to 95% without losing accuracy and cutting inference costs by 50–80%.
Joining us means working on cutting‑edge solutions that make AI faster, greener, and more accessible — and being part of a company often described as a “quantum‑AI unicorn in the making.”
- Competitive annual salary.
- Two unique bonuses: a signing bonus when you join and a retention bonus at contract completion.
- Relocation package (if applicable).
- Fixed‑term contract ending in June 2026.
- Hybrid role and flexible working hours.
- Be part of a fast‑scaling Series B company at the forefront of deep tech.
- Equal pay guaranteed.
- International exposure in a multicultural, cutting‑edge environment.
We are seeking skilled and experienced Deep Learning Engineers (Senior and Mid-level) to join our team. In this role you will leverage cutting‑edge quantum and AI technologies to lead the design, implementation, and improvement of our computer vision and language models, and work closely with cross‑functional teams to integrate these models into our products.
You will have the opportunity to work on challenging projects, contribute to cutting‑edge research, and shape the future of LLM and AI technologies.
- Design, train, and optimize deep learning models from scratch (including LLMs and computer vision models), working end‑to‑end across data preparation, architecture design, training loops, distributed compute, and evaluation.
- Apply and further develop state‑of‑the‑art model compression techniques, including pruning (structured/unstructured), distillation, low‑rank decomposition, quantization (PTQ/QAT), and architecture‑level slimming.
- Build reproducible pipelines for large‑model compression, integrating training, re‑training, search/ablation loops, and evaluation into automated workflows.
- Design and implement strategies for creating, sourcing, and augmenting datasets tailored for LLM pre‑training and post‑training, and computer vision models.
- Fine‑tune and adapt language models using methods such as SFT, prompt engineering, and reinforcement or preference optimization, tailoring them to domain‑specific tasks and real‑world constraints.
- Conduct rigorous empirical studies to understand trade‑offs between accuracy, latency, memory footprint, throughput, cost, and hardware constraints across GPU, CPU, and edge devices.
- Benchmark compressed models end‑to‑end, including task performance, robustness, generalization, and degradation analysis across real‑world workloads and business use cases.
- Perform deep error analysis and structured ablations to identify failure modes introduced by compression, guiding improvements in architecture, training strategy, or data curation.
- Design experiments that combine compression, retrieval, and downstream fine‑tuning, exploring the interaction between model size, retrieval strategies, and task‑level performance in RAG and Agentic AI systems.
- Optimize models for cloud and edge deployment, adapting compression strategies to hardware constraints, performance targets, and cost budgets.
- Integrate compressed models seamlessly into production pipelines and customer‑facing systems.
- Maintain high engineering standards, ensuring clear documentation, versioned experiments, reproducible results, and clean modular codebases for training and compression workflows.
- Participate in code reviews, offering thoughtful, constructive feedback to maintain code quality, readability, and consistency.
- Master’s or Ph.D. in Computer Science, Machine Learning, Electrical Engineering, Physics, or a related technical field.
- 3+ years of hands‑on experience training deep learning models from scratch, including designing architectures, building data pipelines, implementing training loops, and running large‑scale distributed training jobs.
- Proven…