Infrastructure Engineer

3+ years • $145,000 - $195,000 per year
DevOps

Job Description

LangChain is seeking an Infrastructure Engineer to join their Infra team, focusing on developer productivity for LangGraph Cloud/Platform and LangSmith products. The role involves owning the end-to-end test strategy, setting up ephemeral test environments in Kubernetes, and enhancing quality practices in CI/CD. The engineer will also focus on observability for tests, performance, reliability, and incident workflows, contributing to making intelligent agents ubiquitous through reliable agent engineering platforms and open-source frameworks.

About LangChain

At LangChain, our mission is to make intelligent agents ubiquitous. We provide the agent engineering platform and open source frameworks developers need to ship reliable agents fast.

Our open source frameworks, LangChain and LangGraph, see more than 90 million downloads per month and help developers build agents with speed and granular control. LangSmith offers observability, evaluation, and deployment for rapid iteration, enabling teams to transform LLM systems into dependable production experiences.

LangChain is trusted by millions of developers worldwide and powers AI teams at companies like Replit, Clay, Cloudflare, Harvey, Rippling, Vanta, Workday, and more.

About the role

In person 5 days/week in San Francisco, CA or New York, NY

We’re hiring a Software Engineer to join the Infra team and own developer productivity across our LangGraph Cloud/Platform and LangSmith products. You’ll work closely with Infra, Backend, and Frontend to ship with confidence across Kubernetes-based services, APIs, and UI flows—and you’ll help pioneer quality practices specific to LLM applications (e.g., prompt regressions and evaluation suites).

  • Own test strategy end-to-end across APIs, services, UI, data, and infra (K8s/Terraform/Helm).
  • Stand up ephemeral test environments in Kubernetes for PRs and release candidates; seed test data and run hermetic suites.
  • Shift-left quality in CI/CD (GitHub Actions): parallelization, caching, deterministic seeds, flake tracking, and quality gates.
  • Observability for tests: rich failure artifacts (videos, logs, traces), Datadog dashboards, and actionable alerts.
  • Performance & reliability: baseline SLIs/SLOs for critical paths; capacity tests and regression detection.
  • Partner on incident workflows: reproduce issues, add focused regression tests, and improve runbooks/postmortems.
  • Documentation: high-signal test plans, playbooks, and contributor guidelines for writing good tests.

Example projects you might own

  • A PR-ephemeral E2E harness that deploys a minimal LangSmith stack on Docker in CI and runs Playwright + API suites against seeded tenants.
  • A k6 scenario that simulates multi-tenant traffic with queue/backpressure, surfacing p95/p99 latency regressions per release.
  • A flake-budget system that auto-quarantines flaky tests, opens issues with artifacts, and tracks “time-to-deflake”.

How to be successful in this role

  • 3+ years as Infra Engineer/Software Engineer focused on
  • Strong hands-on experience with Python (pytest)
  • Familiarity with CI/CD (GitHub Actions preferred) and making pipelines fast, parallel, and reliable.
  • Solid understanding of API testing, mocking/stubbing, and data setup/teardown.
  • Comfortable defining quality bars, authoring test plans, and driving cross-team execution.

Bonus

  • Load/perf testing (k6), observability (Datadog, OpenTelemetry), and property-based testing (Hypothesis).
  • Experience testing services running on Kubernetes and containers; comfortable with logs, events, and basic kubectl.
  • Infra awareness: Helm/Terraform basics, Kubernetes networking, and secrets management.
  • SQL fluency for data validation (Postgres/ClickHouse/BigQuery).
  • Go/Node/React familiarity for targeted white-box tests and testability improvements.

Compensation & Benefits

  • We offer competitive compensation that includes base salary, meaningful equity, and benefits such as health and dental coverage, flexible vacation, a 401(k) plan, and life insurance. Actual compensation will vary based on role, level, and location. For team members in the EU and UK, we provide locally competitive benefits aligned with regional norms and regulations.
  • Annual salary range: $145,000-$195,000 USD for Senior Engineers
