Senior AI/ML Engineer, CH
Listed on 2026-06-12
Software Development
AI Engineer (Applied/Software), Machine Learning/ML Engineer, Data Engineering, Data Scientist
As a Senior AI/ML Engineer at vector8, you will design, implement, and deploy AI solutions that bridge the gap between research and production. Your work will focus on integrating and fine-tuning AI models, optimizing model performance, and ensuring enterprise-grade reliability, security, and scalability.
This is a hands-on engineering role where you will:
- Develop and optimize LLM- and VLM-powered solutions for enterprise use cases
- Develop and optimize TTS, STT, and ML models
- Collaborate with cross-functional teams (data engineers, MLOps, cloud architects, and business stakeholders)
- Solve real-world enterprise challenges (security, compliance, legacy system integration)
- Own the full lifecycle of AI models, from data exploration to production monitoring
You will work closely with vector8’s Engineers and Project Managers to co-design AI foundations that enable organizations to scale AI from individual use cases to enterprise-wide capabilities.
The role is primarily based in Zurich, with occasional travel to client sites and collaboration with teams across Europe.
Job requirements
- 5+ years of experience in AI/ML engineering, software development, or a related field
- Expertise in LLM architectures and training methodologies:
- Transformers, attention mechanisms, fine-tuning, RAG, quantization
- Prompt engineering, model evaluation, bias detection
- Strong knowledge of machine learning architectures: fully connected, CNN, LSTM, transformers, and classical ML models
- Strong software engineering skills:
- Proficient in Python (FastAPI, Pydantic, asyncio, type hints)
- Experience with API development
- Familiarity with modern tool chains (Docker, Kubernetes, Terraform)
- Hands-on experience with LLM integrations:
- LLM providers
- Vector databases (Pinecone, Weaviate, Milvus)
- Model serving (vLLM, TGI, KServe)
- Experience with MLOps and production deployments
- Understanding of enterprise challenges:
- Security, compliance, scalability, cost optimization.
- Experience with relational and non-relational databases.
- Strong problem-solving and debugging skills.
- Excellent communication and collaboration skills (fluent in English; German is a strong plus).
- Bachelor’s or Master’s degree in Computer Science, Mathematics, Physics, or a related field.
- Experience with multi-cloud environments (AWS, Azure, GCP).
- Experience with code optimization (e.g., model quantization, parallelization).
1. End-to-End Model Development
- Design, implement, and deploy distributed, high-volume, high-performance, low-latency machine learning solutions, with a focus on GenAI models, and especially LLM integrations and API-driven architectures
- Take ownership of your models throughout their entire life cycle:
- Data exploration and cleaning to build reproducible, versioned datasets
- State-of-the-art research to identify the best architectures for the problem (e.g., transformers, RAG, fine-tuning)
- Implementation, training, and optimization in reproducible environments
- Deployment, monitoring, and maintenance in production
- Optimize models for performance, latency, and cost efficiency, especially in LLM serving and inference
2. Software Engineering for AI
- Write clean, modular, and well-documented code in Python (FastAPI, Pydantic, asyncio)
- Apply best practices in:
- Testing (unit, integration, end-to-end)
- CI/CD (GitHub Actions, GitLab CI, ArgoCD)
- Observability (logging, monitoring, tracing)
- Ensure security and compliance (data protection, access controls, encryption)
- Integrate models and code into CI/CD pipelines for seamless deployment
3. AI & ML Integration & API Development
- Design and implement AI-powered solutions that integrate with APIs, microservices, and event-driven architectures
- Develop and optimize AI pipelines for:
- Dataset cleaning, preprocessing, and model training
- Fine-tuning (domain adaptation, instruction tuning)
- Retrieval-Augmented Generation (RAG) (vector databases, semantic search)
- Prompt engineering (optimizing inputs for performance, cost, and accuracy)
- Build scalable, secure, and cost-efficient serving infrastructure (e.g., FastAPI, vLLM)
- Debug and optimize performance (latency, throughput, token efficiency for Transformer-based architectures)
- Deploy and monitor AI models in production
- Design and implement MLOps pipelines…