Cloud/DevOps Engineer
Listed on 2026-02-12
IT/Tech
Cloud Computing, Data Engineer
100% REMOTE. EST HOURS only.
The Cloud Data Platform Engineer will play a central role in the deployment, monitoring, and optimization of the cloud platforms used by our data scientists. This role requires hands‑on expertise in cloud infrastructure, modern DevOps practices, secure data operations, and automation frameworks. You'll partner closely with data scientists, machine learning engineers, and data engineers to ensure our analytics systems run securely, efficiently, and reliably. You'll work with our Platform Engineering team to define and implement these patterns so they can be codified and reused across our organization.
Location:
Remote, preferably US-based.
- Manage, scale, and optimize cloud environments used for data science workloads (primarily AWS, Databricks, and dbt).
- Provision, maintain, and optimize compute clusters for ML workloads (e.g., Kubernetes, ECS/EKS, Databricks, SageMaker).
- Implement and maintain high‑availability solutions for mission‑critical analytics platforms.
- Develop CI/CD pipelines for model deployment, infrastructure‑as‑code (IaC), and automated testing using industry‑standard toolchains.
- Build monitoring, alerting, and logging systems for cloud and ML infrastructure (e.g., Datadog, CloudWatch, Prometheus, Grafana, ELK).
- Automate provisioning, configuration, and deployments using tools such as Terraform, CloudFormation, GitHub Actions, etc.
- Enable and improve data ingestion, transformation, and model execution workflows through platform capabilities and automation.
- Develop and maintain self‑service capabilities for data scientists to provision and manage reliable, reproducible environments for research and development.
- Collaborate with Data Engineering to maintain integrations between data pipelines and cloud systems.
- Share responsibility for provisioning and operating application networking capabilities that support data platforms, including API gateways, CDNs, application load balancers, TLS, and WAFs.
- Implement and operationalize security and compliance controls for data science platforms in alignment with enterprise cloud standards.
- Conduct periodic risk assessments, best practice reviews, and remediation efforts to strengthen security and resiliency.
- Support secure handling of sensitive financial data.
- Partner with data scientists, machine learning engineers, and data engineers to deeply understand and support their needs and workflows within data‑driven initiatives.
- Serve as a technical advisor on cloud architecture, performance optimization, and production readiness for data and ML platforms.
- Adopt and champion Agile, DevOps, and Platform Engineering practices (Kanban, Scrum, continuous improvement, automation, Everything‑as‑a‑Service).
- Demonstrate a strong, proactive focus on serving internal customers: prioritize user experience and identify opportunities to leverage automation and self‑service to reduce toil and cognitive load for developers and researchers.
- A bachelor's degree or higher in a STEM field is required.
- 5+ years of experience in cloud operations, DevOps, platform engineering, SRE, systems administration, or related roles.
- Strong proficiency with at least one major cloud provider (AWS preferred).
- Hands‑on experience with IaC tools (Terraform, CloudFormation, or similar).
- Strong scripting skills (Python, Bash, or PowerShell).
- Strong understanding of modern authentication and authorization technologies and secrets management (IAM, OIDC, OAuth2, RBAC, ABAC, privileged access management, JIT authorization, PKI).
- Experience with common CI/CD systems (GitHub Actions, Jenkins, GitLab CI, ArgoCD, or similar).
- Familiarity with container orchestration (Docker Compose, EKS/Kubernetes, ECS).
- Experience supporting data‑intensive or ML workloads.
- Experience in financial services, investment management, or other highly regulated industries.
- Knowledge of ML/AI platform tools (Databricks, SageMaker, MLflow, Airflow).
- Hands‑on experience with AI engineering and LLMOps tools (LLM observability, eval pipelines, building/supporting agentic workflows) is a huge plus.
- Understanding of networking, VPC architectures, and cloud security best practices.
- Familiarity with distributed compute frameworks (Spark, Ray, Dask).
Dexian is an Equal Opportunity Employer that recruits and hires qualified candidates without regard to race, religion, sex, sexual orientation, gender identity, age, national origin, ancestry, citizenship, disability, or veteran status.