Terraform Engineer - Azure-Centric
Listed on 2026-02-16
IT/Tech
AI Engineer, Cloud Computing
Dallas, United States | Posted on 02/02/2026
We are seeking a highly skilled Senior Terraform Engineer with deep expertise in Azure services to join our Enterprise AI Platform team. This role is Azure-centric, with a strong emphasis on deploying Machine Learning (ML) and Generative AI (GenAI) models in scalable, secure enterprise environments.
The ideal candidate will have hands‑on experience with multi‑cloud architectures, Infrastructure as Code (IaC) best practices, and a strong foundation in ML workflows, enterprise AI platforms, and cloud‑based ML services. You will play a key role in automating infrastructure provisioning, integrating AI/ML pipelines, and optimizing deployments for performance, cost, security, and compliance across a multi‑cloud landscape.
This position requires a proactive engineer who can bridge DevOps and MLOps, leveraging Terraform to support high‑impact AI initiatives. If you thrive in fast‑paced environments and are passionate about building robust, automated cloud infrastructures for AI at scale, this role offers a unique opportunity to drive innovation.
Design, implement, and maintain Infrastructure as Code (IaC) solutions using Terraform to provision and manage Azure resources, including:
- Related services supporting ML and GenAI model deployment
- Develop and enforce IaC best practices, including automated policy and security testing using tools such as Terragrunt and Checkov
- Deploy and orchestrate ML and GenAI models on enterprise ML platforms
- Enable end‑to‑end automation across the ML lifecycle, from model training through inference
- Integrate AI/ML workflows with CI/CD pipelines (Azure DevOps, GitHub Actions)
- Collaborate with data scientists, ML engineers, and cross‑functional teams to design multi‑cloud architectures, with Azure as the primary platform and AWS/Google Cloud Platform integrations
- Support hybrid deployments, data sovereignty requirements, and disaster recovery strategies
- Implement cross‑cloud networking, identity federation, and resource orchestration
- Optimize cloud infrastructure for AI/ML workloads, including compute clusters
- Ensure infrastructure meets enterprise security, availability, and compliance standards (e.g., GDPR, SOC 2)
- Monitoring
- Alerting
- Leverage observability tools such as Azure Monitor, Prometheus, and MLflow to ensure reliable, production‑grade deployments
- Troubleshoot and resolve infrastructure issues in production AI environments
- Ensure high availability, scalability, and reliability of AI platforms
- Conduct code reviews, mentor junior engineers, and contribute to documentation for ML/GenAI‑specific IaC patterns
- Stay current with emerging Azure ML services
- Participate in on‑call rotations and incident response for critical AI infrastructure
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent professional experience)
- 5+ years of experience as a Cloud Engineer, DevOps Engineer, or similar role
- At least 3 years of hands‑on experience with Terraform for IaC in Azure environments
- Proven experience deploying ML and GenAI models using Azure ML, including managed endpoints and inference pipelines
- Strong hands‑on experience with multi‑cloud architectures (AWS and/or Google Cloud Platform preferred)
- In-depth understanding of Terraform concepts, including modules, variables and outputs, workspaces, and backends
- Solid understanding of the machine learning lifecycle
- Experience with containerization and orchestration tools: Docker
- Proficiency in scripting languages such as Python, PowerShell, or Bash
- Familiarity with cloud security best practices for ML environments, including encryption, access controls, and vulnerability scanning
- Strong problem‑solving skills and experience working in Agile teams
- HashiCorp Certified: Terraform Associate
- Experience with additional IaC tools such as ARM Templates, Bicep, or Pulumi (for hybrid Azure setups)
- Background in MLOps tooling, including MLflow
- Experience with cloud cost optimization for AI workloads using tools like Azure Cost Management
- Prior experience working in regulated industries (finance, healthcare, etc.) with compliance‑driven infrastructure requirements