Principal Machine Learning Engineer, ML Platform
Oregon, Lucas County, Ohio, 43616, USA
Listed on 2026-02-15
IT/Tech
Machine Learning/ ML Engineer, AI Engineer
About Shippo
At Shippo, our vision is bold and clear: we are the shipping layer of the internet. Our mission is to make every merchant successful through excellent shipping, delivering world-class logistics technology and infrastructure. We’re building the backbone of global e-commerce, connecting merchants to carriers worldwide through a single API and intuitive dashboard.
As a remote-first and globally distributed team, we believe flexibility fuels trust, autonomy, and performance. Our diverse perspectives across continents, cultures, and time zones drive our innovation and enable us to build solutions used by businesses everywhere. We invest in modern, scalable technology so our teams can build, ship, and iterate with confidence.
Your impact starts here: every person at Shippo plays a direct role in shaping the infrastructure that powers global commerce and makes shipping simpler for businesses around the world.
How we will deliver success together:
Shippo is expanding applied ML across core business problems — delivery-date prediction, fraud detection, anomaly detection, and other optimizations in shipping logistics. To deliver these capabilities reliably and at scale, we need a standardized, production-grade ML platform that makes it easy to develop, test, deploy, and operate models.
This Principal ML Platform Engineer will build the “paved roads” that reduce time-to-production, improve model and service reliability, lower operational risk, and advance ML workflows. Our stack is Databricks-centric today, but we want a vendor-agnostic leader who can advise when Databricks is the right fit and when alternative approaches are better for performance, cost, or operational simplicity. This role will directly increase ML product velocity and improve the consistency and quality of ML systems that power customer-facing experiences and internal decision-making.
- Set technical strategy and drive a multi-quarter roadmap for ML platform capabilities aligned to Shippo’s business priorities.
- Own cross-team architecture decisions, RFCs, and design reviews for ML lifecycle and inference.
- Raise the engineering bar through mentorship, production readiness standards, and reusable platform primitives.
- Be accountable for platform adoption, reliability, and cost-performance outcomes.
- Build and operate core ML platform components:
  - ML lifecycle foundation (experiment tracking, reproducibility, artifact management, model registry, versioning, and controlled promotion workflows using MLflow or equivalent).
  - Training and experimentation enablement (standardized environments, reusable pipelines/templates, evaluation harnesses, and repeatable workflows that let data scientists move from exploration to production with confidence).
  - Kubernetes-native model serving for real-time inference (safe rollout and rollback, autoscaling, reliability practices, and cost controls).
  - Batch inference and scoring pipelines (repeatable backfills, retraining triggers, consistent packaging between training and inference).
  - Observability for ML systems (service health metrics, alerting, and model-quality signals such as drift and data quality).
  - Developer experience (templates, reference implementations, documentation, and self-service workflows).
- Evaluate and recommend inference frameworks and deployment patterns, and document tradeoffs for Shippo’s workloads.
- Identify and resolve performance bottlenecks across the inference stack (model runtime, compute utilization, networking, serialization, and autoscaling behavior).
- Establish ML engineering standards across training, evaluation, testing, model packaging, CI/CD, production readiness, and incident response.
- Partner with Data Science teams to bridge research and production environments by creating repeatable frameworks, shared standards for code quality and reproducibility, and self-serve paths to deploy models safely.
- Collaborate with Data and Engineering teams to ensure the platform supports real workflows, drives adoption, and meets reliability expectations.
- Mentor engineers through design reviews, architecture guidance, and shared best practices across…