Senior Site Reliability Engineer (Platform Engineering)

Remote / Online - Candidates ideally in Texas, USA
Listing for: Landbot
Full Time, Remote/Work from Home position
Listed on 2025-12-08
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Position: Senior Site Reliability Engineer (Platform Engineering)

Full time. Remote position. While this is a remote position, we are currently only considering candidates in time zones between UTC-1 and UTC+2.

About Landbot

Operating in more than 150 countries, Landbot offers a platform that helps companies create exceptional chatbot and AI agent conversations across several channels, such as Web, WhatsApp, and Messenger. At Landbot, we’re building a high-performance team that blends engineering excellence, product mindset, and customer obsession. We believe quality and speed go hand in hand, and we’re looking for a Senior Site Reliability Engineer to help us scale our platform and deliver real impact.

About The Team

You’ll join our Platform Engineering team, a small, focused group responsible for building and maintaining Landbot’s Engineering Platform, Data Platform, and Security. Our mission is to empower Landbot teams to deliver value faster, more reliably, and at scale.

Core Team Values
  • Platform-as-product mindset
  • Autonomy and ownership
  • Collaboration over gatekeeping
The Role

As a Senior Site Reliability Engineer, you will:

Build and Maintain the Internal Developer Platform
  • Design and implement core platform services (CI/CD pipelines, infrastructure provisioning, and observability systems).
  • Design and implement developer-facing tools, APIs, and automation that enable application teams to deploy, scale, and operate services independently.
Define and Maintain Platform Operations
  • Manage and optimize cloud resources, Kubernetes clusters, databases, and networking for reliability, scalability, and cost optimization.
  • Establish SLIs, SLOs, and error budgets to balance reliability with feature velocity.
  • Design and maintain observability solutions for real-time visibility and proactive issue detection.
  • Implement alerting strategies that reduce noise and focus on actionable signals.
  • Lead incident response, conduct blameless postmortems, and drive continuous improvement.
Enhance Developer Experience and Drive Platform Strategy
  • Partner with application teams (platform customers) to understand their workflows and pain points, gather feedback, and prioritize improvements aligned with business objectives.
  • Create and maintain documentation, runbooks, and knowledge bases that reduce knowledge silos and enable self-service.
  • Drive decisions through written formats (RFCs, ADRs) that document architectural choices.
  • Measure platform success through developer productivity metrics, adoption rates, and toil reduction.
Experience
  • 3-5 years of experience in Site Reliability Engineering, Platform Engineering, Infrastructure Engineering, or DevOps roles, or as a full-time freelancer in similar roles.
  • Experience reducing operational toil through automation and self-service tooling.
  • Experience building internal platforms or developer tooling, or enabling platform capabilities from within application teams, with a platform-as-product mindset focused on developer experience.
  • Experience managing production infrastructure and establishing reliability practices (SLIs/SLOs, observability, incident response).
Technical Skills
  • Strong working knowledge of Kubernetes and the container ecosystem.
  • Experience with cloud platforms (GCP, AWS, Azure).
  • Proficiency with Infrastructure as Code tools.
  • Knowledge of Kubernetes manifest management tools and GitOps practices.
  • Experience with observability platforms. Knowledge of OpenTelemetry is a plus.
  • Strong shell scripting skills. Experience with Python or Go is a plus.
  • Experience with Linux, database management, networking, and distributed systems.
  • Solid knowledge of CI/CD pipelines.
  • Ability to work effectively in paired/mob programming and asynchronous work environments.
Nice to Have
  • Experience with database performance tuning, query optimization, replication strategies, and database scaling in production environments.
  • Familiarity with security best practices in cloud-native environments.
  • Experience with data platforms, data pipelines, and data infrastructure (data warehouses, data lakes, ETL/ELT processes, streaming data platforms).
  • Experience supporting AI workloads and infrastructure (LLM platforms, AI agents, vector databases, AI…
Position Requirements
10+ Years work experience