Senior Site Reliability Engineer
Listed on 2026-06-23
IT/Tech
Cloud Computing: Infrastructure & Operations, SRE/Site Reliability, Systems Engineer
About the role
We are looking for a Senior Site Reliability Engineer (SRE) to join our Site Reliability Engineering Team, working closely with a dedicated product team to modernize infrastructure, strengthen system resilience, and scale our global platform, leveraging AI tools and agents to accelerate delivery and improve system quality.
Are you passionate about building modern, scalable cloud platforms that power real-world impact? At IDEXX, we are transforming how laboratory systems operate globally, helping veterinarians deliver better outcomes and enabling pets to live fuller lives. Our software supports Reference Laboratories, a critical area of IDEXX’s business, enabling high-volume diagnostic workflows, operational efficiency, and clinical insight at scale.
Responsibilities
In this role you will be responsible for:
- Own the design and evolution of CI/CD pipeline architecture, governance, and standards.
- Modernize and automate deployment pipelines for Kotlin-based AWS Lambda services using GitHub Actions.
- Standardize infrastructure and deployment processes across services, reducing manual deployment effort through automation.
- Leverage AI tools where appropriate to improve productivity and system quality.
- Design, build, and evolve scalable, resilient AWS cloud infrastructure: lead the implementation of disaster recovery, high availability, and fault‑tolerant designs; automate infrastructure provisioning and lifecycle management.
- Build and maintain end-to-end observability (metrics, logging, tracing, alerting); establish effective alerting that reduces noise; proactively identify and address system risks; lead incident response in a shared on‑call rotation; drive root cause analysis and blameless post‑mortems.
- Own and govern the release process, including deployment gates and approvals; review and approve deployment plans; optimize the build and release lifecycle for speed, consistency, and reliability; manage cross‑repo dependencies and versioning strategies.
- Lead remediation of security vulnerabilities; collaborate with the Security team; establish processes to proactively prevent new security risks; embed secure development and deployment practices into pipelines.
- Guide the development team toward reliability and security best practices; proactively identify issues, drive visibility, and ensure timely resolution; stay up to date with industry trends and emerging technologies; communicate technical concepts clearly to both technical and non‑technical stakeholders.
- 7+ years of experience in DevOps, SRE, Platform Engineering, or similar roles focused on CI/CD, cloud infrastructure, and system reliability.
- Strong experience with AWS Serverless architectures, Terraform and CloudFormation, CI/CD pipelines (GitHub Actions preferred), Azure Entra, OAuth2, OpenID Connect, Maven build tooling, and Git-based version control workflows (GitHub preferred).
- Proven ability to design and optimize deployment pipelines.
- Troubleshoot complex distributed systems.
- Make data‑driven decisions.
- Translate business requirements into scalable technical solutions.
- Strong communication, collaboration, and organizational skills.
- Understanding of system design patterns for reliability and scalability.
- Experience with Kotlin or Java development.
- Experience with NoSQL databases such as DynamoDB and relational databases such as PostgreSQL.
- Experience working in Agile or Scrum environments.
- Familiarity with artifact management tools such as JFrog Artifactory.
- Experience defining and managing SLAs, SLOs, and SLIs.
- Experience with distributed tracing tools such as AWS X‑Ray or OpenTelemetry.
- Experience using AI tools or AI agents to improve development, automation, or operational workflows.
Cloud: AWS Serverless (Lambda, SQS, SNS, EventBridge, DynamoDB, S3, AuroraDB, CloudFront, etc.)
Infrastructure as Code: Terraform, AWS CloudFormation.
CI/CD: GitHub Actions, Maven.
Version Control: Git, GitHub.
Languages: Kotlin (primary), Java, Python, TypeScript.
We prefer candidates who can commute to our Westbrook, Maine location and be on site 8 days per month. We also consider candidates further away in NH or MA who…