Senior SRE Engineer – Cloud Operations

Remote / Online - Candidates ideally in
California, Moniteau County, Missouri, 65018, USA
Listing for: Core Talent Finder
Remote/Work from Home position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Systems Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Location: California

Remote – Americas
Full-time

We are recruiting on behalf of a fast-growing AI infrastructure company that builds a high-performance vector database powering semantic search, RAG pipelines, AI agents, and large-scale machine learning applications.

We are seeking a Senior Site Reliability Engineer (SRE) to join the Cloud Operations team and help ensure reliability, observability, and operational excellence across production cloud environments.

This role is highly operations-focused and ideal for engineers who enjoy owning system reliability, improving automation, and operating large-scale distributed systems in production.

About the Role

As a Senior SRE, you will be responsible for maintaining and improving production infrastructure while reducing operational risk and improving system reliability at scale.

You will work closely with platform engineering and infrastructure teams to ensure systems remain secure, performant, and highly available as customer usage grows.

Location Requirements
  • Remote – Americas (North, Central, or South America)

  • Candidates must be able to work primarily within American time zones

Key Responsibilities

Cloud Infrastructure & Operations

  • Operate and maintain production cloud infrastructure at scale

  • Manage Kubernetes clusters, networking, and deployment pipelines

  • Improve reliability, performance, and security of production systems

Monitoring & Observability

  • Enhance monitoring, logging, and alerting systems

  • Improve operational visibility and incident detection

Incident Response & Reliability

  • Lead incident response and root cause analysis

  • Implement preventive measures and continuous reliability improvements

  • Participate in on-call rotations

Automation & Process Improvement

  • Reduce operational toil through automation and tooling

  • Maintain and improve runbooks and operational procedures

Collaboration

  • Work closely with platform engineering and infrastructure teams

  • Support scalable architecture and operational best practices

Requirements
  • 5+ years of experience in DevOps, SRE, or infrastructure operations

  • Strong hands-on experience running Kubernetes in production

  • Solid understanding of:

    • Linux systems

    • Networking fundamentals

    • Cloud infrastructure (AWS, GCP, or Azure)

  • Experience with monitoring, alerting, and incident management

  • Experience with infrastructure automation or infrastructure-as-code

  • Comfortable participating in on-call rotations

  • Strong communication and problem-solving skills

Preferred Qualifications
  • Experience with Terraform or similar IaC tools

  • Familiarity with Prometheus, Grafana, Loki, or OpenTelemetry

  • Scripting experience in Python, Bash, or Go

  • Experience in SaaS, cloud platforms, or data infrastructure environments

  • Exposure to security, compliance, or system hardening

What's Offered
  • Competitive compensation and benefits

  • Fully remote work environment

  • Flexible working hours

  • Opportunity to work on mission-critical cloud infrastructure

  • Collaborative, engineering-driven culture

How to Apply

If you are passionate about reliability engineering, cloud infrastructure, and large-scale distributed systems, we would love to hear from you.

Position Requirements
10+ Years work experience