Senior Cloud Engineer
Listed on 2026-05-16
IT/Tech
Systems Engineer, Cloud Computing
Clayco is a full-service, turnkey real estate development, master planning, architecture, engineering, and construction firm that safely delivers clients across North America the highest quality solutions on time, on budget, and above and beyond expectations. With $8.1 billion in revenue for 2025, Clayco specializes in the art and science of building, providing fast track, efficient solutions for mission critical, industrial, life sciences, power & energy, aviation, commercial, institutional, residential and sports & entertainment related building projects.
The Role We Want You For
Cloud Engineering owns the day-to-day operation, evolution, and reliability of our cloud estate. The majority runs in Azure (application gateways, App Service Environments, storage, containers, Functions, VMs, and Azure SQL), with traditional lift-and-shift constructs still doing real work alongside more modern patterns. AWS is a meaningful and growing part of the picture too. This is the team that keeps that environment running well: provisioning, optimizing, troubleshooting, securing, and moving workloads onto code-defined patterns.
This role brings modern IaC, automation, and engineering discipline to the cloud estate directly, and helps raise the bar on what production-ready looks like here. This is an AI-forward position. Senior leadership is all-in on AI, and we want someone who genuinely uses it every day for code, troubleshooting, documentation, and reasoning. Not as a buzzword. As a daily multiplier.
Specifics of the Role
- Everything you build and operate has one measure: is the cloud estate more reliable, more secure, more cost-efficient, and more transparent than it was yesterday?
- Design, provision, and operate the services that run our environment. Most of this is Azure today: App Service and ASE, Azure Storage, Azure SQL Database and Managed Instance, Azure Functions, VMs and VM Scale Sets, and container workloads on AKS, ACI, and ACR. The same patterns apply to AWS workloads where they run there.
- Own application-layer traffic for the workloads you run (Application Gateway, Load Balancer, Front Door, Traffic Manager in Azure today; AWS equivalents follow the same patterns), partnering with Network Engineering on the underlying connectivity.
- Operate, troubleshoot, and tune workloads for performance, capacity, cost, and security posture. When something is wrong, you're the one who finds it and fixes it.
- Drive disciplined improvement of existing lift-and-shift workloads: PaaS-ification where it makes sense, right-sizing, decommissioning, and modernization that keeps things running while making them better.
- Ship infrastructure with Terraform as the primary tool. Use existing modules where they exist, build new ones where they don't, and establish patterns that scale as the estate grows.
- Use Ansible for configuration management and Packer for image baking; treat golden images, hardened baselines, and post-provision configuration as code, not one-off changes.
- Apply the same discipline to AWS, using CloudFormation and Terraform with consistent cross-cloud patterns.
- No click-ops as the durable answer. If you made a change in the portal to fix something today, the follow-up is to encode it in IaC tomorrow.
- Use AI tooling (GitHub Copilot, ChatGPT, Claude, agent frameworks) every day for code, IaC, troubleshooting, log analysis, and documentation. You should be visibly faster because of it.
- Build AI-assisted automation into operational work: incident triage, runbook execution, drift detection, change summarization, cost analysis. Look for toil and remove it.
- Explore agentic patterns (workflow engines, autonomous tasks, intelligent automation) and bring operational reality to those experiments.
- Implement and enforce security guardrails: Defender for Cloud findings, Key Vault hygiene, identity and access patterns (Entra, managed identities, RBAC), policy as code, secret handling, and network segmentation.
- Build observability into everything you ship: dashboards, monitors, and SLOs defined as code, with alerts that tell on-call what's actually wrong at 2 AM. Strong experience with any modern observability platform translates.
- Treat cost as a first-class engineering concern. Tag discipline, reservations, scaling policies, and steady right-sizing, not heroic quarterly cleanups.
- Contribute to disaster recovery and business continuity patterns (backups, failover, recovery testing) built into the infrastructure rather than bolted on.
- Deep, hands-on Azure experience (5+ years) across the services listed above: App Service/ASE, Storage, AKS/containers, Functions, VMs, Azure SQL, and app-layer traffic. You've built it, broken it, fixed it, and operated it.
- Working knowledge of Azure networking (VNets, NSGs, Private Endpoints, hybrid connectivity, DNS), sufficient to design well-formed workloads and collaborate effectively with Network Engineering.
- Terraform fluency: modules, state, workspaces, and real production usage.
- Scripting and…