Site Reliability Engineer II

Job in Cambridge, Middlesex County, Massachusetts, 02142, USA
Listing for: Akamai
Full Time position
Listed on 2026-06-03
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Job Description

Are you passionate about cutting-edge AI infrastructure?

Do you want to build your SRE career on one of the most exciting platforms in cloud computing?

Join the Akamai Inference Cloud Team

The Akamai Inference Cloud team is part of Akamai's Cloud Technology Group. We design, implement, deploy and operate AI platforms that enable customers to run inference models and developers to create AI applications.

Partner with the best

In this role, responsibilities will include automation, monitoring, incident response, and working collaboratively with skilled team members. Candidates should possess expertise in Linux systems, automation, and SRE practices. Daily activities involve coding, improving dashboards, enhancing alerts, and minimizing repetitive tasks. Opportunities exist to focus on GPU infrastructure, Kubernetes, and ensuring reliability for AI workloads within Akamai's serverless inference platform.

As a Site Reliability Engineer II, you will be responsible for:
  • Building and maintaining dashboards, alerts, and monitoring for inference workloads using Akamai's existing observability platform
  • Writing automation and tooling in Python or Go to reduce operational toil and improve system reliability
  • Building and improving runbooks for inference-specific operational procedures, integrating into Akamai's existing incident management processes
  • Contributing to SLO tracking and reporting, identifying trends and areas for improvement
  • Supporting CI/CD pipeline maintenance, deployment safety checks, and rollback procedures
  • Collaborating with product engineering teams to troubleshoot complex problems across the stack
  • Participating in on-call rotations, responding to production incidents, and conducting blameless post-mortems
Do what you love

To be successful in this role you will:
  • Have 2+ years of experience in Site Reliability Engineering and a bachelor's degree or equivalent experience
  • Demonstrate coding ability in at least one programming language (Python or Go) with experience writing automation
  • Have experience with Linux systems administration and the ability to troubleshoot complex infrastructure issues
  • Show familiarity with Kubernetes and containerization concepts
  • Have experience with monitoring and observability tools such as Prometheus, Grafana, or similar
  • Have exposure to CI/CD pipelines and infrastructure-as-code tools (Terraform, SaltStack, or equivalent)
  • Show a willingness to learn and grow, with genuine curiosity about AI infrastructure and distributed systems
Work in a way that works for you

Flex Base, Akamai's Global Flexible Working Program, is based on the principles that are helping us create the best workplace in the world. When our colleagues said that flexible working was important to them, we listened. We also know flexible working is important to many of the incredible people considering joining Akamai. Flex Base gives 95% of employees the choice to work from their home, their office, or both (in the country advertised).

This permanent workplace flexibility program is consistent and fair globally, to help us find incredible talent, virtually anywhere. We are happy to discuss working options for this role and encourage you to speak with your recruiter in more detail when you apply.

Learn what makes Akamai a great place to work

Connect with us on social and see what life at Akamai is like!

We power and protect life online, by solving the toughest challenges, together.

At Akamai, we're curious, innovative, collaborative and tenacious. We celebrate diversity of thought and we hold an unwavering belief that we can make a meaningful difference. Our teams use their global perspectives to put customers at the forefront of everything they do, so if you are people-centric, you'll thrive here.

Working for you

At Akamai, we will provide you with opportunities to grow, flourish, and achieve great things. Our benefit options are designed to meet your individual needs for today and in the future. We provide benefits surrounding all aspects of your life:
  • Your health
  • Your finances
  • Your family
  • Your time at work
  • Your time pursuing other endeavors
Our benefit plan options are designed to meet your individual…