Site Reliability Engineer, Kubernetes Platform (Starshield)

Job in Redmond, King County, Washington, USA
Listing for: SpaceX
Full Time position
Listed on 2026-07-04
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing: Infrastructure & Operations, SRE/Site Reliability, Unix/Linux
Salary/Wage Range or Industry Benchmark: 125000 - 150000 USD Yearly
Job Description & How to Apply Below
Position: Site Reliability Engineer, Kubernetes Platform (Starshield)
Location: Redmond

SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.

SITE RELIABILITY ENGINEER, KUBERNETES PLATFORM (STARSHIELD)

At SpaceX we’re leveraging our experience in building rockets and spacecraft to deploy the Starshield constellation. Starshield is the world’s largest US government satellite constellation and is tasked with providing immediate access to critical intelligence and national security data for the US government anywhere on the globe. We design, build, test, and operate all parts of the system – receivers that allow users to connect within minutes, and the software that brings it all together.

We’ve only begun to scratch the surface of Starshield's global impact and are looking for best-in-class engineers to help us further our ambitious goals.

As an engineer focused on Starshield's software and network infrastructure, you will design, operate and scale the infrastructure we use to run the world’s largest government satellite constellation. These positions cover a variety of areas, including Site Reliability Engineering, Developer Operations, and our internal Kubernetes platforms. You will develop automation to deploy and manage on-premise compute resources, create highly scalable and maintainable software products, and directly collaborate with engineering across the board.

RESPONSIBILITIES:
  • Develop automation to deploy and manage on-premise Kubernetes clusters
  • Deploy and manage core infrastructure such as databases, monitoring and distributed storage
  • Closely collaborate with software engineers to create highly scalable, operable, and maintainable products
  • Engage in and improve the whole lifecycle of services -- from inception and design, through deployment, operation and refinement
  • Monitor and alert on supporting systems to maintain high availability
  • Hands‑on integration and troubleshooting across the entire Starshield stack
  • Identify areas for improvement and create innovative solutions that enable high system availability
BASIC QUALIFICATIONS:
  • Bachelor’s degree in computer science, information systems/IT, or an engineering discipline and 1+ years of professional experience in site reliability engineering or DevOps; OR 3+ years of professional experience in site reliability engineering or DevOps in lieu of a degree
  • 1+ years of professional experience with Linux operating systems
  • Experience with Terraform, Ansible, or other infrastructure tools
  • Experience with containerization technologies (e.g., OCI containers, Kubernetes)
  • Experience scripting in Bash, Python, or other similar languages
  • Development experience in Python, C++, or Go
PREFERRED SKILLS AND EXPERIENCE:
  • 1+ years of experience with Python and Python-based development frameworks
  • Experience managing Kubernetes clusters, not just using them
  • Knowledge of Linux boot process and systems configuration
  • Deep understanding of testing, continuous integration, build, deployment & continuous monitoring
  • Understanding of relevant build technologies, such as Bazel and Makefiles
  • Focus on performance bottlenecks and performance improvement techniques
  • Understanding of distributed databases and data modeling
  • Experience with automatically managing dozens, hundreds, or thousands of servers (e.g., with Terraform or Ansible)
  • Strong networking knowledge of TCP/IP
  • Excellent communication skills, with the ability to communicate with customers, peers, and management in both formal and informal situations
  • Active Top Secret, Top Secret SCI, or DOE Level Q clearance
ADDITIONAL REQUIREMENTS:
  • Must be willing to work extended hours and weekends as needed
  • This position requires successfully obtaining and maintaining a Top Secret Security Clearance as a condition of employment. While the clearance may not be immediately necessary upon hire, we encourage you to initiate the application process promptly upon accepting this offer. Your ability to secure the necessary clearance is essential for fulfilling key responsibilities of the role. Should you be unable to obtain it,…