MLOps Engineer; Remote
Waukesha, Waukesha County, Wisconsin, 53186, USA
Listed on 2026-06-30
IT/Tech
Cloud Computing: Infrastructure & Operations, Data Engineering, SRE/Site Reliability
Project Details
Own the end-to-end lifecycle of production ML: training, packaging, deployment, monitoring, and governance. Build reusable pipelines and tooling so data scientists and contractors can ship reliable models quickly – batch and real-time – on Google Cloud.
Must Have Skills
- 4+ years of MLOps/ML platform or DevOps experience for data/ML systems
- Hands-on GCP experience: BigQuery, Cloud Run, Cloud Storage, Pub/Sub, Cloud Build (Vertex AI a plus)
- Proficiency with Python, packaging (Docker), and CI/CD
- Solid SQL skills and understanding of data modeling for ML features/labels
- Experience operating production models with monitoring, alerting, and incident response
- Model registry & experiment tracking (MLflow, W&B, or Vertex AI)
- Data validation & monitoring (Great Expectations, TensorFlow Data Validation, WhyLabs, Arize)
- Feature store concepts (BQ-based or managed)
- Canary/shadow deployments, autoscaling, and performance tuning
- IaC (Terraform), testing frameworks (unit/integration/load), and observability (OpenTelemetry, Cloud Monitoring)
- Pipelines & orchestration: Design CI/CD and scheduled pipelines for training and inference (Cloud Build, Workflows/Scheduler, Pub/Sub, Cloud Run; Vertex Pipelines if used).
- Packaging & deployment: Standardize model packaging (Docker), artifact/versioning, and rollout strategies (A/B, canary, shadow) with automated rollbacks.
- Data/feature flows: Define contracts for features/labels in BigQuery and manage backfills; support batch and (where applicable) streaming features.
- Registry & experimentation: Stand up a model registry and experiment tracking (MLflow/Weights & Biases/Vertex) with approvals and audit trails.
- Monitoring & quality: Implement data/feature validation, drift/decay monitoring, performance/latency SLOs, and alerting; build dashboards and playbooks.
- Security & compliance: Enforce IAM least privilege, service accounts, Secret Manager, provenance/lineage, and change management.
- Cost & performance: Track training/inference cost and latency; optimize hardware/autoscaling and query patterns.
- Enablement: Create templates, docs, and tooling so DS/contractors can add models with minimal friction.
- Data/Storage: BigQuery, Cloud Storage (artifacts, datasets)
- CI/CD & IaC: Cloud Build or GitHub Actions, Terraform
- ML Tooling: MLflow/W&B/Vertex, Docker, PyTorch/TF/XGBoost (as provided by DS)
- Monitoring: Cloud Logging/Monitoring, Evidently/WhyLabs/Arize, custom run IDs & metrics
- Compute/Orchestration: Cloud Run, Workflows/Scheduler, Pub/Sub, Vertex Pipelines (optional)
- Small, versioned releases; test-first pipelines; documented runbooks.
- Clear SLOs and blameless incident reviews.
Remote (working in CST hours)
Compensation
Hourly Rate Range: $40–$60/hr
Benefits Offered
Health, Dental, Vision Insurance
Deadline
Applications accepted until 10/30/2025 at 11:59 PM CST
Equal Employment Opportunity Statement
We are an Equal Pay Employer. All employment decisions, including compensation, benefits, hiring, training, and promotions, are made based on merit, qualifications, and business needs. We do not discriminate on the basis of gender, race, ethnicity, age, disability, sexual orientation, or any other protected characteristic. We are committed to ensuring equal pay for equal work and regularly review our compensation practices to promote fairness, equity, and transparency across our organization.