Backend Engineer
Listed on 2026-02-12
We're hiring a Staff Engineer - Workflows (Engine) to own and evolve the orchestration layer that powers automation at scale within High Level. You'll rebuild the engine that drives 80M+ workflow enrolments per day and nearly 1 billion action executions per day at peak load with reliability, performance, and clarity at its core.
You'll operate as a hands‑on individual contributor, designing and writing core engine logic in Go and Node.js, while defining contracts, reliability guarantees, and scaling patterns. This is a Staff Architect track role with deep technical ownership, not management. You'll have decision authority on engine architecture and are expected to set the gold standard for how distributed systems are built at High Level's automation team.
You'll join the Workflows (Engine) team, a small, autonomous group of top‑tier engineers within the Workflows organisation. The team's mission is to build the next‑generation orchestration engine that powers all automations at High Level: reliable, testable, and ready for future expansion.
This engine handles concurrency, ordering, retries, and scaling under unpredictable load. Your rewrite will make the system more robust, easier to extend with new actions and context types, and dramatically simpler to maintain. Ultimately, your work will enable faster feature delivery across the entire automation platform.
Day to day, you'll work closely with the Engineering Manager (Workflows) and a Lead Engineer from Workflow Core, operating as the founding engineer for the new Engine org.
Responsibilities
- Re-architecture: Rebuild the Workflow Engine from Node.js to Go, creating a modular, high-performance foundation for billions of executions.
- Core abstractions: Design orchestration, state, retries, and execution guarantees with clear contracts and isolation boundaries.
- Performance model: Optimise for throughput-first execution while maintaining strict ordering within each workflow execution context.
- APIs and contracts: Define interfaces and schemas between Engine, Triggers, and Actions. Ensure consistent, reliable, and versioned communication.
- Reliability and observability: Partner with SRE to instrument metrics (latency, throughput, failure rate) and build replay and diagnostics tooling.
- Operational ownership: Own the engine's runtime incidents, RCA, and prevention. Deliver measurable reliability improvements (< 1% failures/day).
- Migration and rollout: Drive dual-run migration with progressive rollout and auto-rollback safety.
- Engineering culture: Set the technical benchmark for clarity, testability, and performance within Workflows and beyond.
- 10+ years of backend engineering experience with deep hands-on work in distributed systems, job schedulers, or orchestration engines.
- Advanced proficiency in Go (preferred) and Node.js, with experience writing low-latency, high-throughput microservices.
- Strong understanding of testability and isolation principles; you design systems that are easy to test, reason about, and extend.
- Production-grade database experience (MongoDB, Firestore, or equivalent) with sound data modelling.
- Cloud experience (GCP, AWS, or Azure), especially event-driven services like Pub/Sub, SQS, or Cloud Tasks.
- Proven record of measurable performance wins: reduced p95/p99 latency, improved throughput, or increased reliability.
- Strong fundamentals in concurrency, idempotency, ordering guarantees, and fault tolerance.
- Pragmatic engineering mindset: simplicity and clarity over abstraction for abstraction's sake.
- Strong applied understanding of design patterns and system architecture principles, with the ability to model orchestration, state, and retries using proven, scalable patterns.
- Experience with automation systems (e.g., n8n, Zapier-like architectures) and the ability to generalise such models.
- Observability experience (Grafana, Prometheus, OpenTelemetry, Kibana).
- Strong debugging and performance profiling skills.
- Open-source contributions or technical writing in distributed systems.
- Experience helping define engineering best practices or building early-stage teams.