
Infrastructure Engineer; fully remote

Remote / Online - Candidates ideally in Greater London, W1B, England, UK
Listing for: Yocto Project
Remote/Work from Home position
Listed on 2026-06-18
Job specializations:
  • IT/Tech
    Cloud Computing: Infrastructure & Operations, Systems Engineer, SRE/Site Reliability, AWS
Salary/Wage Range or Industry Benchmark: 80000 - 100000 GBP Yearly
Job Description & How to Apply Below
Position: Staff Infrastructure Engineer (fully remote)
Location: Greater London

Overview

We are looking for a Staff Infrastructure Engineer to lead the technical direction and execution of balenaCloud’s infrastructure and reliability architecture. As our customer base and device fleets expand globally, we need a dedicated technical lead to drive our transition into multi-region hosting and single-tenant dedicated instances, natively within Amazon Web Services (AWS).

At balena, we don't have traditional managers or hierarchy; we rely on high levels of trust, autonomy, and alignment. You will be joining at the Staff Level (Tactical scope / Domain Leader). Given the company strategy (the Why), you define the Tactics and the What, design the How, and heavily participate in the Do.

This role represents a dual leadership mandate: you will operate across both Infrastructure Engineering (planning for immense scale, multi-region hosting, and deep AWS automation) and Reliability Engineering (designing the observability tooling, defining operational procedures, and scaling the team's ability to debug and improve the system). Our infrastructure is deeply rooted in AWS, and we need an engineer who can drop in and be highly effective within this ecosystem immediately.

Your Impact (Responsibilities)

As a Staff Level engineer, you are one of the most experienced team members in your domain. You are not a "ticket solver"; you gain significant autonomy but own the responsibility for your architectural decisions.

  • AWS-Native Architecture: Architect, automate, and optimize deeply integrated AWS environments. You will leverage the right AWS services to build a system that hosts balenaCloud reliably, delivering maximum performance and deep cost/resource optimization on a per-device basis.
  • Infrastructure & Reliability: Bridge the gap between building for scale and running for stability. You will not only design the infrastructure but also drive the reliability practices for our growing systems, including continuous improvement, robust feedback loops, and incident resilience.
  • Architect for Massive B2B Scale: Design infrastructure capable of handling enterprise-level loads: billions of requests per week (>30 million/hour) and terabytes of data per day. Your mental model should align with massive B2B platforms rather than B2C media streaming.
  • Multi-Region & Single-Tenant Hosting: Own the technical tactics and execution to deploy single-tenant, single-region balenaCloud instances (e.g., dedicated instances in the EU, Australia, US, or Japan) to satisfy strict customer data sovereignty needs.
  • Kubernetes at Scale: Architect and manage multiple balenaCloud stacks simultaneously, overseeing the deployment and orchestration of many independent Kubernetes clusters for various customers.
  • Decade-Long Reliability: We are responsible for physical devices in the real world that will stay deployed for decades. Short-term, fragile infrastructure solutions are unacceptable, as they risk rendering devices lost in the field. Your designs and implementations must meet our >10-year durability bar.
  • Team Enablement & Async Collaboration: You will scale your knowledge across an overwhelmed engineering team. You will document, articulate, and demonstrate decision proposals based on objective facts and empirical evidence, minimizing the need for synchronous calls.
Essential Qualifications
  • Experience: Minimum of 6 years of highly relevant professional work experience in infrastructure and reliability engineering.
  • Deep AWS Expertise: Proven, hands‑on mastery of the AWS ecosystem. You must be able to navigate, architect, and optimize AWS services with immediate effectiveness.
  • Observability & Reliability: Deep understanding of Site Reliability Engineering principles. You have proven experience building highly usable observability tooling, metrics, and monitoring systems from the ground up to support high availability.
  • Exceptional Documentation Skills: Strong, hands‑on ability to write clear, actionable, and maintainable technical documentation, scaling plans, and onboarding materials for the team.
  • Distributed Systems: Proven experience in multiple geolocation hosting with distributed data and processing, specifically in multi‑tenant SaaS…