Senior Platform Engineer
Listed on 2025-12-27
IT/Tech
Systems Engineer, Cloud Computing, SRE/Site Reliability, Data Engineer
We are looking for a Senior Platform Engineer to join our Platform squad in the Data & Infrastructure tribe. We’re responsible for building, maintaining and scaling the system infrastructure that handles tens of millions of requests a day as well as its security and reliability.
You’ll work as part of a close-knit, collaborative team to provide the best experience for our customers and our engineers.
For our engineers, we aim to provide the best deployment experience, the greatest visibility into application and service performance, and the best ways of managing and maintaining the services for which they are responsible. We need an engineer willing to engage directly with engineers across the business to understand what’s slowing them down, what tooling they’d love to see and how we can best help them to focus on delivering the product at Lyst.
For our customers, we aim to provide a fast, reliable experience, serving millions of images and terabytes of data every day across multiple continents. We need an engineer with attention to detail who’ll help the team by working to understand all our metrics and networking, spotting issues that may be causing a bad experience for our customers and helping create solutions.
What will you be working on?
- Deployment: orchestrating Terraform, Docker and GitHub to enable our engineers to deploy the code for our 200+ applications, services and APIs
- Automation: managing systems programmatically and building tooling to enable engineers to manage their infrastructure, metrics and code declaratively; providing a simple, consistent experience that lets them focus on product value
- Scale: bringing Lyst to a billion requests per month, using Cloudflare, AWS and Kubernetes to run Lyst efficiently at a huge, distributed scale
- Monitoring and alerting: measuring application performance and delivering insights, metrics and relevant alerts to the engineering teams with ELK, Grafana and New Relic
- Ownership: driving engineering teams to own their infrastructure and costs by building great tooling, visibility and documentation
- Security: setting the standards for fine-grained access to systems and applications to allow automated access control and support auditing requirements
- Autonomy: you’ll have the ability to determine what software, tooling and services we should use and why - the Lyst values encourage trust and a focus on impact
- Mentoring: you’ll work with other operations‑minded engineers around the business to encourage feedback and ideas in the form of a “DevOps” chapter, acting as a subject matter expert
- Process: you’ll provide input on the best way to organise the team and drive documentation of our processes and responsibilities, engage with the strategic direction of the tribe and help drive roadmaps that achieve this progress
- Familiarity with AWS and experience creating/administering resources on the platform (we currently make heavy use of EKS, EC2, ECS, RDS and ElastiCache)
- Good knowledge of data stores, their management and optimisation (we’ve got PostgreSQL, Elasticsearch, Redis and DynamoDB behind our tooling and apps)
- Experience using or managing infrastructure‑as‑code, particularly Terraform
- Good Python and shell‑scripting knowledge in order to work with our tooling pipelines
- Proficient with containers and container orchestration (we currently use Docker containers running on EKS)
- Expertise in logging and monitoring at scale (S3, Graphite, Grafana, Elasticsearch and Kibana)
- Knowledge of a DevOps toolchain to drive ownership of a self-hosted platform
- Competent in Git and the GitOps philosophy
- Familiarity with concepts for managing very large application load (e.g. CDNs, load balancers) and providing high availability services
- Experience in an agile environment, working iteratively to solve problems
- Our Ways of Working: We all come into the office on Tuesdays and Thursdays, with the option to work remotely or come into the office on the other days. We believe that in-person collaboration and community spirit are super important, which is why we spend some of our time in the office and some of our time at home.
- Time Off: In addition to the 8…