Infrastructure Engineer
Listed on 2026-02-12
IT/Tech
Systems Engineer, Cloud Computing
About Us
At Kalderos, we are building unifying technologies that bring transparency, trust, and equity to the entire healthcare community with a focus on pharmaceutical pricing. Our success is measured when we can empower all of healthcare to focus more on improving the health of people.
That success is driven by Kalderos' greatest asset, our people. Our team thrives on the problems that we solve, is driven to innovate, and thrives on the feedback of their peers. Our team is passionate about what they do and we are looking for people to join our company and our mission.
That's where you come in!
What You'll Do
We are looking for a collaborative Staff Infrastructure Engineer. The ideal candidate should have a strong inclination to work in rapidly developing and expanding organizations and possess the necessary background to do so. You are well-acquainted with the fast‑paced, high‑volume, and uncertain nature of operations in the organization, and perceive it as a chance to deliver significant outcomes. Across all roles, we look for future team members who will live our values of being BOLD (Bias to Action, One Team, Lead by Example, Dumpster Diving Data Mavens).
- Own critical infrastructure and support platform domains end to end.
- Set direction and standards for reliability, security, and developer experience.
- Lead complex, cross‑team initiatives and architecture decisions.
- Act as a force multiplier by mentoring and unblocking other engineers.
This is a full‑time Staff Infrastructure Engineer role, which can be based in Chicago, IL or Boston, MA. Relocation assistance will not be provided.
Key Responsibilities
Architecture & Strategy
- Define and evolve infrastructure and support platform architecture for key products and services.
- Drive a multi‑year roadmap for reliability, security and support platform capabilities.
- Establish reference architectures and golden paths that product and data teams can adopt.
- Lead adoption of SRE practices: SLOs/SLIs, error budgets, incident management, and post‑incident reviews.
- Design and maintain observability (metrics, logs, traces, alerts, dashboards) across services and infrastructure.
- Automate runbooks and self‑healing mechanisms to reduce toil and improve MTTR.
- Design and evolve a self‑service platform (environments, CI/CD, infra templates) that helps teams ship safely and quickly.
- Own and improve Infrastructure as Code (IaC) and GitOps workflows.
- Build reusable platform components and tools that standardize how services are built, deployed, and operated.
- Partner with Security and Compliance to embed zero‑trust, least privilege, and encryption into infrastructure patterns.
- Support healthcare‑grade compliance (e.g., SOC 2, HIPAA, HITRUST) through infrastructure design and automation.
- Collaborate with data teams on secure, reliable data platforms and pipelines.
- Lead FinOps practices: visibility into spend, cost optimization, and guardrails that balance cost and reliability.
- Design for resilience and recovery: backup/restore, DR, and regional failover strategies.
- Continuously tune performance and capacity across compute, storage, and networking.
- Cloud & Platform: Deep experience with at least one major cloud provider (AWS, Azure, or GCP, with Azure preferred) at an architectural level; proven ability to design and operate highly available, secure, cloud‑native systems.
- Containers & Orchestration: Hands‑on experience with Kubernetes or similar, plus key ecosystem tools (ingress, service mesh, operators).
- Infrastructure as Code & Automation: Expert with IaC (Terraform, CloudFormation, Pulumi, etc.), modular design, and automation using Python, Go, or similar.
- CI/CD & Delivery: Experience designing and maintaining CI/CD pipelines (e.g., GitHub Actions, GitLab CI, CircleCI, Azure DevOps), including progressive delivery (blue/green, canary, feature flags) and safe rollback.
- Observability & SRE: Strong understanding of SRE principles and hands‑on work with observability stacks (e.g., Prometheus, Grafana, OpenTelemetry, ELK/EFK, Datadog, New Relic).
- Se…