System Software Engineer

Job in Carlsbad, Eddy County, New Mexico, 88221, USA
Listing for: Hoonify Technologies Inc.
Full Time position
Listed on 2026-06-22
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Systems Engineer, Cloud Computing: Infrastructure & Operations, IT Infrastructure
Salary/Wage Range or Industry Benchmark: USD 100,000 - 125,000 per year
Job Description & How to Apply Below

Hoonify delivers secure, sovereign AI infrastructure designed for the next generation of inference workloads. Powered by TurbOS®, our platform enables organizations and Neo Cloud/data center operators to transform CPU/GPU infrastructure into production-ready AI environments—supporting local LLMs, agentic copilots, RAG, and embeddings. We empower teams with robust model lifecycle management, multi‑tenant controls, usage metering, and fully auditable operations.

The Role

We are seeking a System Software Engineer to help build, deploy, and operate the multi‑cloud computational platform and model‑serving infrastructure that underpin our AI/ML developer platform. This role focuses on the implementation, automation, and day‑to‑day operation of production systems, working under the technical direction of senior engineers and within the platform's established architectural patterns.

The successful candidate will deliver well‑engineered, well‑tested infrastructure changes and grow their depth across Kubernetes, GPU‑backed workloads, observability, and continuous delivery in a production environment.

This role enables meaningful growth in cloud infrastructure, distributed systems, and ML serving. You will work directly with senior engineers on real production systems, receive code and design review on your work, and have a clear path to expand scope and ownership as your experience deepens.

Core Responsibilities
  • Implement and maintain Kubernetes workloads and supporting resources, including manifests, Helm charts, controllers, and configuration for networking, ingress, and storage, following established platform patterns.
  • Deploy and operate model‑serving workloads on GPU and accelerator node pools, including configuring autoscaling policies, resource requests and limits, and tenant‑specific deployment configurations.
  • Support model training and simulation workloads on distributed GPU systems.
  • Build and maintain instrumentation on Prometheus, Grafana, and OpenTelemetry, including authoring dashboards, alerting rules, and trace and metric instrumentation for new services.
  • Implement and improve CI/CD pipelines, including build, test, and deployment automation, and contribute to progressive delivery practices already in use on the platform.
  • Develop and maintain infrastructure‑as‑code modules and automation scripts in support of repeatable, auditable infrastructure changes across cloud environments.
  • Support response to production incidents, execute documented runbooks, and contribute to postmortems and follow‑up remediation work.
  • Investigate and resolve issues across the stack, including container, node, network, and accelerator‑level problems, escalating appropriately when scope exceeds the role.
  • Write clear documentation, including runbooks, internal references, and design notes for the changes you ship.
  • Participate in code and design reviews, both as author and reviewer, and incorporate feedback from senior engineers into your work.

Required Qualifications
  • Bachelor's degree in Computer Science, Computer Engineering, or Information Technology, plus three (3) years of relevant work experience, or an equivalent combination of education and relevant experience.
  • Professional experience in cloud infrastructure, DevOps, site reliability, or backend engineering roles involving production system operation.
  • Working knowledge of Kubernetes in a production context, including writing and debugging manifests, understanding core resource types, and operating production workloads.
  • Hands‑on experience with at least one major cloud provider or platform (e.g. AWS, GCP, or OpenStack), including its compute, networking, and identity primitives.
  • Experience instrumenting services and consuming observability data, including writing Prometheus queries, building Grafana dashboards, or working with distributed traces.
  • Familiarity with CI/CD systems and the basic mechanics of automated build, test, and deployment pipelines.
  • Experience with configuration management and infrastructure‑as‑code tools (e.g. Ansible, Puppet, or Helm).
  • Proficiency in at least one programming or scripting language used for infrastructure work (Python, Go, Rust, or Bash).
  • Comfort working in a Linux…