Senior Software Engineer, Observability
Listed on 2026-03-01
IT/Tech
Systems Engineer, Cybersecurity
Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI — without sacrificing scale, speed, or sustainability.
Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.
About This Role:
We’re seeking a Senior Software Engineer to play a key role on our Observability team within the Cloud Infrastructure organization. This team owns the real-time observability platforms that underpin visibility, reliability, and operational insight across our cloud and data center infrastructure.
What You’ll Be Working On:
Maintain and manage core observability tools, including platforms for metrics, events, logs, and tracing.
Develop and operate data pipelines to move telemetry data from various sources to backend storage.
Manage large-scale data ingestion and storage requirements for high-volume environments.
Perform regular updates and software enhancements to ensure system stability and security.
Participate in a standard on-call rotation to address production issues and perform root cause analysis.
Work with other engineering teams to implement monitoring best practices and standardized tooling.
Contribute to the long-term technical roadmap for the company's internal infrastructure.
5+ years of experience in software or systems engineering.
Proficiency in Java, Go, or Python for writing production-level code.
Practical experience managing Kubernetes clusters in a production environment.
Experience deploying and managing services using Helm and YAML-based configurations.
Ability to troubleshoot and resolve issues within distributed system architectures.
Experience participating in an on-call rotation for business-critical systems.
Experience with common observability tools such as Prometheus, Grafana, Loki, ClickHouse, or Elasticsearch.
Familiarity with Kafka or similar message queuing systems.
Experience using Terraform for infrastructure provisioning.
Knowledge of OpenTelemetry standards.
Familiarity with GPU-based infrastructure or machine learning workloads.
Industry competitive pay
Restricted Stock Units in a fast-growing, well-funded technology company
Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
Employer contributions to HSA accounts
Paid Parental Leave
Paid life insurance, short-term and long-term disability
Teladoc
401(k) with a 100% match up to 4% of salary
Generous paid time off and holiday schedule
Cell phone reimbursement
Tuition reimbursement
Subscription to the Calm app
MetLife Legal
Company-paid commuter benefit: $300/month
Compensation will be paid in the range of $172,000 to $209,000 plus bonus. Restricted Stock Units are included in all offers. Compensation will be determined by the applicant's knowledge, education, and abilities, as well as internal equity and alignment with market data.
Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.