Infrastructure Engineer
Listed on 2025-12-02
Software Development
DevOps, Cloud Engineer - Software
Infrastructure Engineer role at LangChain
About LangChain
At LangChain, our mission is to make intelligent agents ubiquitous. We provide the agent engineering platform and open source frameworks developers need to ship reliable agents fast. Our open source frameworks, LangChain and LangGraph, see over 90 million downloads per month and help developers build agents with speed and granular control. LangSmith offers observability, evaluation, and deployment for rapid iteration, enabling teams to transform LLM systems into dependable production experiences.
LangChain is trusted by millions of developers worldwide and powers AI teams at companies like Replit, Clay, Cloudflare, Harvey, Rippling, Vanta, Workday, and more.
In person, 5 days/week in San Francisco, CA or New York, NY
Role Summary
We’re hiring a Software Engineer to join the Infra team and own developer productivity across our LangGraph Cloud/Platform and LangSmith products. You’ll work closely with Infra, Backend, and Frontend to ship with confidence across Kubernetes-based services, APIs, and UI flows, and you’ll help pioneer quality practices specific to LLM applications (e.g., prompt regressions and evaluation suites).
Responsibilities
- Own test strategy end-to-end across APIs, services, UI, data, and infra (K8s/Terraform/Helm).
- Stand up ephemeral test environments in Kubernetes for PRs and release candidates; seed test data and run hermetic suites.
- Shift-left quality in CI/CD (GitHub Actions): parallelization, caching, deterministic seeds, flake tracking, and quality gates.
- Observability for tests: rich failure artifacts (videos, logs, traces), Datadog dashboards, and actionable alerts.
- Performance & reliability: baseline SLIs/SLOs for critical paths; capacity tests and regression detection.
- Partner on incident workflows: reproduce issues, add focused regression tests, and improve runbooks/postmortems.
- Documentation: high-signal test plans, playbooks, and contributor guidelines for writing good tests.
- A PR-ephemeral E2E harness that deploys a minimal LangSmith stack on Docker in CI and runs Playwright + API suites against seeded tenants.
- A k6 scenario that simulates multi-tenant traffic with queue backpressure, surfacing p95/p99 latency regressions per release.
- A flake-budget system that auto-quarantines flaky tests, opens issues with artifacts, and tracks "time-to-deflake".
- 3+ years as Infra Engineer/Software Engineer focused on ...
- Strong hands-on experience with Python (pytest)
- Familiarity with CI/CD (GitHub Actions preferred) and making pipelines fast, parallel, and reliable.
- Solid understanding of API testing, mocking/stubbing, and data setup/teardown.
- Comfortable defining quality bars, authoring test plans, and driving cross-team execution.
- Load/perf testing (k6), observability (Datadog, OpenTelemetry), and property-based testing (Hypothesis).
- Experience testing services running on Kubernetes and containers; comfortable with logs, events, and basic kubectl.
- Infra awareness: Helm/Terraform basics, Kubernetes networking, and secrets management.
- SQL fluency for data validation (Postgres/ClickHouse/BigQuery).
- Go/Node/React familiarity for targeted white-box tests and testability improvements.
- We offer competitive compensation that includes base salary, meaningful equity, and benefits such as health and dental coverage, flexible vacation, a 401(k) plan, and life insurance. Actual compensation will vary based on role, level, and location. For team members in the EU and UK, we provide locally competitive benefits aligned with regional norms and regulations.
- Annual salary range: $145,000-$195,000 USD for Senior Engineers
- Mid-Senior level
- Full-time
- Information Technology
- Technology, Information and Internet