SRE/DevOps Engineer Job, Jamestown area, Town of Poland, New York, USA, IT/Tech

Position: SRE / DevOps Engineer
Location: Town of Poland

Engagement context

Takeover of a production AI mobile coaching platform.

Runtimes: Node.js/NestJS, Python/FastAPI
Data stores: MongoDB, Postgres, Redis
Infra: AWS (ECS/EKS, RDS, ElastiCache, S3, VPC, IAM)
CI: GitHub Actions
Observability: Datadog
Push: OneSignal
Errors: Crashlytics
Deep links: Branch
Vendors: Auth0, ElevenLabs, OpenAI, Amplitude, Terra, Strava

Role summary

Senior SRE/DevOps. Owner: CI→production, IaC, deploy automation, observability, on-call, cost control, secrets, security baselines. Phase 1: measure and document. Phase 2: operate and transfer ownership.

First 90 days

Audit CI/CD (GitHub Actions): duration, flakiness, failure modes, secrets handling
Audit AWS: ECS/EKS topology, IAM posture, VPC layout, RDS, ElastiCache, S3
Audit Datadog: dashboards, tracked metrics, SLO/SLI gaps
Audit incidents (last 12 months): count, severity, MTTR, RCA patterns
Vendor inventory: Auth0, OneSignal, ElevenLabs, OpenAI, Branch, Amplitude, Terra, Strava, Crashlytics — owners, billing, MFA, recovery plans
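
As an illustration of the incident audit item above, here is a minimal Python sketch that counts incidents and computes MTTR per severity. The record layout, the severity labels, and the sample timestamps are hypothetical. A real audit would pull this data from whatever incident tracker the platform uses.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical incident records: (severity, detected_at, resolved_at).
INCIDENTS = [
    ("sev1", datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 11, 30)),
    ("sev1", datetime(2024, 3, 2, 22, 15), datetime(2024, 3, 2, 22, 45)),
    ("sev2", datetime(2024, 2, 11, 9, 0), datetime(2024, 2, 11, 13, 0)),
]


def mttr_by_severity(incidents):
    """Return {severity: (incident_count, mean_time_to_restore)}."""
    durations = defaultdict(list)
    for severity, detected_at, resolved_at in incidents:
        durations[severity].append(resolved_at - detected_at)
    # Sum timedeltas starting from a zero timedelta, then divide by the count.
    return {
        severity: (len(items), sum(items, timedelta()) / len(items))
        for severity, items in durations.items()
    }


if __name__ == "__main__":
    for severity, (count, mttr) in sorted(mttr_by_severity(INCIDENTS).items()):
        print(f"{severity}: {count} incidents, MTTR {mttr}")
```

Grouping by severity keeps a few long low-severity incidents from masking slow recovery on the critical ones.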

Ongoing

Own CI/CD across services
Own AWS infra (Terraform/Pulumi where suitable)
Cost control (OpenAI token spend, AWS rightsizing)
Security baselines: least-privilege IAM, secrets rotation, dependency scanning
Build onboarding for a second SRE/DevOps hire
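
The OpenAI token spend item above could start from something like the following Python sketch, which totals estimated spend per API route. The route names, the usage records, and the per-1,000-token prices are all hypothetical. Actual prices depend on the model and the billing agreement, and in practice the token counts would come from the usage data returned with each API response.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens (not real OpenAI pricing).
PRICE_PER_1K = {"prompt": 0.01, "completion": 0.03}

# Hypothetical usage records: (route, prompt_tokens, completion_tokens).
USAGE = [
    ("/coach/chat", 1200, 400),
    ("/coach/chat", 800, 600),
    ("/plans/generate", 3000, 1000),
]


def spend_by_route(usage, price_per_1k):
    """Return {route: estimated spend in USD} summed over all records."""
    totals = defaultdict(float)
    for route, prompt_tokens, completion_tokens in usage:
        totals[route] += (
            prompt_tokens / 1000 * price_per_1k["prompt"]
            + completion_tokens / 1000 * price_per_1k["completion"]
        )
    return dict(totals)


if __name__ == "__main__":
    # Print the most expensive routes first.
    ranked = sorted(spend_by_route(USAGE, PRICE_PER_1K).items(), key=lambda kv: -kv[1])
    for route, usd in ranked:
        print(f"{route}: ${usd:.4f}")
```

The same per-route totals could be emitted as custom metrics and charted on a dashboard, with the route as a tag.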

Who are we looking for? Must-have skills

5+ years SRE / DevOps / Platform Engineering in production
AWS at depth — ECS or EKS, IAM (assume-role patterns, scoped policies), VPC, RDS, ElastiCache, S3, CloudWatch
Infrastructure-as-code — Terraform (preferred) or Pulumi
GitHub Actions — building reusable workflows, secret handling, reproducible builds
Container fundamentals — Dockerfile authoring, multi-stage builds, image hardening
Linux operations
Datadog in production — logs, APM, metrics, dashboards, monitors, SLO/SLI definition
Incident response — leading or co-leading real production incidents, writing post-mortems
Observability for both Node.js and Python services
Secrets management — AWS Secrets Manager, SOPS, or comparable
Working English

Nice-to-haves

Cost-optimisation discipline (FinOps, AWS Cost Explorer, Reserved Instance planning)
LLM cost monitoring (per-route OpenAI token spend dashboards)
Kubernetes specifically (we may or may not be on EKS)
Security baseline experience — CIS benchmarks, dependency scanning (Snyk, Dependabot), SAST tools
GDPR / data-residency considerations for cross-border data flows (US ↔ PL)
Mobile CI considerations — Fastlane, app-signing automation, TestFlight / Google Play internal tracks
On-call playbook authoring
