
Site Reliability Engineer

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Xona Space Systems
Full Time position
Listed on 2026-06-17
Job specializations:
  • IT/Tech
    Cloud Computing: Infrastructure & Operations, Systems Engineer, SRE/Site Reliability, IT Support
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD yearly
Job Description & How to Apply Below

Requirements

  • Cloud Operations:
    4+ years of experience managing production-grade environments in AWS, GCP, or Azure
  • Orchestration:
    Expert-level proficiency with Kubernetes (EKS), including networking, ingress controllers, and service mesh management
  • Automation:
    Strong experience with configuration management and IaC (e.g., Terraform, Ansible, Helm)
  • Data Systems:
    Deep knowledge of SQL and NoSQL database administration, focusing on replication, backup, and disaster recovery
  • Programming:
    Proficiency in Python and C++ for developing internal tooling and automating complex operational workflows
  • Systems Internals:
    Strong understanding of Linux networking, storage, and kernel tuning
  • (Desirable) Prior experience in Aerospace, Defense, or high-reliability sectors
  • (Desirable) Familiarity with CCSDS standards or satellite ground station software
  • (Desirable) Experience with secure, air-gapped, or hybrid-cloud deployments
What the job involves
  • We are seeking a Site Reliability Engineer (SRE) to architect and manage the critical ground infrastructure for our satellite constellation. This role is responsible for the "last mile" of mission success: ensuring that the software controlling our orbital assets is highly available, scalable, and seamlessly integrated with Mission Operations
  • You will own the lifecycle of our production environments, from automating deployments via Infrastructure as Code (IaC) to managing the core data systems that track constellation health and user activity
  • Infrastructure as Code (IaC):
    Design and maintain scalable, repeatable cloud infrastructure (AWS) using tools like Terraform or CloudFormation
  • Mission Ops Integration:
    Build and optimize the interfaces between core data management systems and Mission Operations software, ensuring reliable telemetry and command flows
  • User & Data Management:
    Architect and maintain high-availability identity providers (IdP) and distributed databases to support global user access and real-time data processing
  • Automated Deployment Pipelines:
    Create and manage robust CI/CD pipelines to deploy containerized applications into production with a focus on zero-downtime and rollback capabilities
  • Observability & Reliability:
    Implement comprehensive monitoring, alerting, and logging (e.g., Prometheus, Grafana, ELK) to ensure 99.99% uptime for ground segment services
  • Scalability Engineering:
    Perform capacity planning and performance tuning to handle the high-throughput data requirements of a growing satellite constellation