MLOps Engineering Specialist
Listed on 2026-06-11
IT/Tech
AI Engineer (Applied/Software), Machine Learning/ ML Engineer
Job Title: MLOps Engineering Specialist
Req
Job Function: Software Engineering
Posting Start Date: 03/06/2026
Posting End Date: 16/06/2026
Division: Networks
Job Location: G-London-BTHQ One Braham
Advertised Salary: Competitive + great benefits
Working locations: Bristol, London
Working Style: 3 days a week in office, 2 days from anywhere
About The Role
We are looking for an AWS MLOps Engineering Specialist to deploy and operate production‑grade machine learning platforms using the AWS SageMaker MLOps framework. This role focuses on enabling the full ML lifecycle (data preparation, model deployment, monitoring, and retraining) through standardised, automated, and governed pipelines. You will work at the intersection of data science, cloud engineering, and DevOps, ensuring models built in SageMaker can be reliably deployed at scale, monitored for drift and performance, and governed in line with enterprise and regulatory expectations.
You will play a key role in standardising ML lifecycle practices, automating pipelines, and embedding operational excellence, security, and cost efficiency into AI/ML workloads.
- Design and implement end‑to‑end MLOps workflows using AWS SageMaker, including:
  - SageMaker Pipelines for training and orchestration.
  - SageMaker Feature Store for feature management.
  - SageMaker Model Registry for model versioning and approvals.
  - SageMaker Experiments for lineage and metadata tracking.
- Enable consistent promotion of models across environments (dev / test / pre‑prod / prod).
- Implement automated retraining strategies triggered by data or performance changes.
- Implement and mature an MLOps framework covering code/data/model versioning, automated testing, release governance, rollback strategies and environment promotion controls.
- Apply security‑by‑design across SageMaker workloads by adopting IAM least‑privilege roles for training, pipelines and endpoints, and by ensuring network isolation using VPC‑attached SageMaker resources.
- Implement model monitoring (data quality, model quality, bias drift, feature attribution drift) and alerting that drives automated responses such as retraining triggers and controlled redeployments.
- Put in place drift detection, evaluation routines, and model performance reporting; partner with data science to define thresholds, baselines and acceptance criteria.
- Define standards for documentation, change management and quality gates that reduce MTTR and improve platform reliability.
- Partner with data scientists to productionise notebooks and experiments into managed pipelines.
- Build scalable inference solutions using SageMaker real‑time and serverless endpoints.
- Strong hands‑on experience with MLOps practices: CI/CD, versioning (code/data/model), release governance, and production monitoring.
- Strong AWS experience, particularly with Amazon SageMaker for ML deployment and monitoring, including drift/quality monitoring approaches.
- Experience designing observability for serverless systems (logs/metrics/traces) and implementing distributed tracing and dashboards using open standards and AWS tooling.
- Containerisation experience (Docker) and familiarity with custom SageMaker containers.
- Familiarity with monitoring, alerting, and incident response for ML platforms.
- Infrastructure‑as‑Code (Terraform, CloudFormation, or CDK).
- Experience with supporting AWS services such as S3, ECR, IAM, Lambda, Step Functions, Glue, and VPC networking.
- Awareness of data privacy, model governance, and responsible AI considerations.
- Understanding of cost optimisation for training and inference workloads.
- Access, use, and disclose information only as required for the job; ensure appropriate safeguards and adherence to Information Security policies.
- Excellent verbal and written communication and interpersonal skills.
- Knowledge of data governance, lineage, and model explainability practices.
- AWS certifications (at least one of these): DevOps Engineer – Professional, Machine Learning Engineer – Associate, AI Practitioner (for GenAI fundamentals).
Tailored benefits make a real difference. That’s why we offer a comprehensive range to support your growth, wellbeing, and everyday life.
You can design the package to suit you and your lifestyle. Your core benefits include:
- 10% on‑target annual bonus
- Access to an online private GP 24/7 for you and your immediate family
- Market‑leading paid carer’s leave with up to 2 weeks off
- Equalised maternity, paternity, and adoption leave – 18 weeks’ full pay and 8 weeks’ half pay
- Discounted EE and BT products, including mobile and broadband
- Market‑leading pension scheme – 5% from you and 10% from us
- Holiday purchase scheme
You can select additional benefits, including healthcare, dental, gym memberships and more when you’re ready.