Lead Cloud Reliability Engineer

Job in Calgary, Alberta, D3J, Canada
Listing for: BigGeo Inc.
Full Time position
Listed on 2026-06-04
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, Cloud Computing, Cybersecurity
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 CAD yearly
Job Description & How to Apply Below

Job Description

Big Geo is the Spatial Cloud.

We help companies manage and access the world’s spatial data.

Any size, any slice, any insight.

Delivered in seconds.

We’re building something that hasn’t existed before: a new layer of the internet where the “where” and “when” behind every decision is instantly clear, programmable, and actionable. Our platform removes the complexity that has kept spatial data locked in silos for decades, and replaces it with speed, precision, and control.

We’re a Calgary-based company, early and moving fast, with real customers, real infrastructure, and a clear point of view on where the world is going.

Why Big Geo Exists and Why People Build Here

Most companies are spatially blind. They know what their data says, but not where or when things actually happen. That gap costs real money, creates real risk, and limits what AI can actually do in the physical world.

Big Geo exists to close that gap.

We’re not building another tool. We’re building the rails that connect the planet’s moving data to the systems that run the world. That’s a big problem, and it takes people who care about doing things right, not just fast.

People build here because:
  • The problem is real and the category is open. We’re not competing for the middle of an existing market. We’re defining a new one. Your work shapes what the category becomes.
  • Your fingerprints are on the architecture. We’re at the stage where the decisions you make today become the foundation tomorrow. What you ship matters.
  • We run on clarity, not politics. We move with purpose. No bureaucratic drag, no HiPPO decisions, just a team that agrees on the mission and gets to work.
  • You’ll grow fast because the problems are hard. Spatial data at scale is a genuinely difficult domain. If you want to be stretched, you’ll be stretched.
  • We’re building for longevity. We’re not chasing hype cycles. We’re building infrastructure, the kind that compounds in value over time and earns the trust of the companies that depend on it.
The Role

Big Geo is looking for a Lead Cloud Reliability Engineer to design and operate the systems that keep The Spatial Cloud running reliably. This role sits at the intersection of hands‑on infrastructure engineering and technical leadership, and it carries real ownership over how dependable our platform feels to the customers, systems, and AI agents that run on top of it.

You’ll be responsible for the reliability architecture that supports spatial compute, data pipelines, and platform services across the Spatial Cloud. Working side‑by‑side with platform engineers, data engineers, and spatial compute teams, you’ll make sure the systems we ship are observable, resilient, and ready to handle large‑scale spatial workloads in production.

This is also a leadership seat. You’ll help set the reliability practices, operational standards, and automation systems that keep the platform stable as it scales across industries and global datasets. If you want to shape how a category‑defining infrastructure company runs in production, this is the role.

Key Responsibilities

Reliability Architecture
  • Design reliability patterns for distributed services across the Spatial Cloud, including failure isolation, graceful degradation, and multi‑region resilience.
  • Ensure systems are fault‑tolerant, production‑ready, and capable of meeting well‑defined SLOs and error budgets.
  • Guide architectural decisions that materially improve platform stability, throughput, and predictability under load.
Observability and Monitoring
  • Build and maintain monitoring, logging, and tracing systems that give every engineer clear visibility into system health, latency, and saturation.
  • Define and maintain meaningful SLIs, SLOs, and alert thresholds that catch real problems without creating noise.
  • Create dashboards, runbooks, and alerting systems that turn raw telemetry into operational awareness the whole team can act on.
Incident Response and Recovery
  • Lead investigation and resolution of reliability incidents, including high‑severity production events.
  • Improve detection, escalation, and recovery processes so service disruptions are shorter,…