
Site Reliability Engineer (Space Communications)

Job in Torrance, Los Angeles County, California, 90504, USA
Listing for: Northwood Space
Full Time position
Listed on 2025-12-28
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, SRE/Site Reliability, Network Engineer
Salary/Wage Range or Industry Benchmark: 108,000 - 140,000 USD yearly
Job Description & How to Apply Below
Position: Site Reliability Engineer (Space Communications)

Location

Los Angeles, CA

Employment Type

Full time

Location Type

On‑site

Department

Software

Compensation
  • $108K – $140K
    • Offers Equity

Compensation at Northwood Space is based on role, level, location, and alignment with market data. Individual base pay is determined on a case‑by‑case basis and may vary depending on job‑related skills, education, experience, and technical expertise. In addition to base salary, Northwood Space offers long‑term incentives such as company stock options and discretionary performance bonuses. Benefits include equity, comprehensive health care, flexible vacation, retirement savings plans, and opportunities for professional development.

About Northwood

Northwood is on a mission to transform connectivity between Earth and space and bring the benefits of space to the masses through innovations in space communications technologies. If you like building quickly and seeing your work deployed in locations around the globe with real impact, we want you at Northwood.

Role

Northwood is looking for an Infrastructure Engineer to help build and maintain our observability infrastructure and ensure our global space communications network operates reliably. As we rapidly scale our operations and establish ground stations around the world, we need someone who can grow with us while building robust monitoring and logging systems and supporting our development teams with reliable CI/CD pipelines.

You’ll be responsible for building and maintaining our observability and monitoring infrastructure, while working closely with engineering teams to improve system reliability and deployment processes. This role offers significant growth opportunities as we scale, and you’ll collaborate with experienced engineers to establish monitoring best practices and incident response procedures. We’re seeking someone with 2-4 years of experience who thrives in a fast‑paced startup environment and is excited to take on diverse infrastructure challenges.

Responsibilities
  • Build and maintain an observability stack with tools like Grafana, Prometheus, Loki, Vector, CloudWatch, VictoriaMetrics, etc. for metrics and log ingestion across environments
  • Support and improve CI/CD pipelines using GitLab and ArgoCD, collaborating with development teams on deployment best practices
  • Help build and maintain cloud infrastructure using Terraform on AWS, contributing to the scalability and reliability of our space communication systems
  • Work with senior engineers to establish monitoring strategies, alerting, and incident response procedures
  • Deploy and manage Kubernetes applications using Helm charts, with a focus on reliability and developer experience
  • Collaborate with engineering teams to implement performance monitoring and troubleshooting across microservices
  • Support identity and access management integration with Okta and HashiCorp Vault
  • Assist in managing NixOS‑based infrastructure for reproducible system configurations
  • Participate in incident response efforts and contribute to post‑incident reviews and improvements
Basic Qualifications
  • 2-4 years of hands‑on experience with infrastructure tools and monitoring systems in production environments
  • Experience with containerization (Docker, Kubernetes) and basic container orchestration
  • Familiarity with CI/CD tools (GitLab, Jenkins, or similar) and infrastructure as code concepts
  • Experience with cloud platforms (AWS preferred) and basic infrastructure automation
  • Programming skills in Python or a similar language and experience with configuration management
  • Startup mentality with the ability to work in fast‑paced, high‑growth environments and take on diverse responsibilities
  • Experience with logging and metrics collection for production systems
  • Understanding of system reliability principles and interest in learning SRE practices
Preferred Qualifications
  • Some exposure to observability tools like Vector, Loki, Grafana, Prometheus, or similar monitoring systems
  • Experience with Terraform or other infrastructure as code tools
  • Familiarity with NixOS or other declarative system configuration approaches
  • Basic knowledge of HashiCorp Vault, Okta, or similar identity/secrets management tools
  • Interest in distributed systems and…