Software Development Engineer, ElastiCache
Listed on 2026-06-03
Software Development
Software Engineer
Description
This is an opportunity to join one of AWS's most foundational and high-impact engineering teams: the In-Memory Computing Platform team, part of Amazon ElastiCache. We build the next-generation, high-performance in-memory distributed data storage platform that powers some of the world's most demanding real-time applications. Our work sits at the intersection of distributed systems, database internals, and cloud-scale infrastructure, and it directly shapes how millions of AWS customers build low-latency, high-throughput applications.
If you've ever found yourself deep in a conversation about the CAP theorem, consistent hashing, Paxos, or gossip protocols, and you want to apply those ideas to real-world systems at massive scale, this team is where you belong. We are the engineers behind the acclaimed Amazon Dynamo paper, and we continue to push the boundaries of what NoSQL systems can do. We're not just building a cache; we're building a durable, highly available, and scalable in-memory database platform that bridges the best of the RDBMS and NoSQL worlds.
As a Software Development Engineer on this team, you will take on broad ownership across the full lifecycle of our platform. Your core responsibilities will include:
- Designing and building the next-generation in-memory NoSQL database platform, enabling developers to create highly available, scalable, and high-performance applications at unprecedented scale.
- Leading software development of large-scale distributed in-memory storage systems, primarily in Java and C/C++, leveraging open-source technologies such as Redis and Memcached alongside Amazon-proprietary technologies.
- Developing and operating HTTP/REST services, asynchronous messaging systems, and event-driven architectures that form the backbone of our platform.
- Building and improving real-time failure detection and auto-remediation systems capable of detecting node failures in large distributed clusters and initiating recovery within seconds.
- Driving horizontal and vertical scaling capabilities, management and monitoring plane workflows, fault tolerance mechanisms, and backup and restore technologies.
- Contributing to disaster recovery and prevention strategies to ensure the highest levels of availability and durability for our customers.
- Mentoring and growing junior engineers on the team, serving as a technical leader and role model for engineering best practices.
- Managing individual project priorities, deadlines, and deliverables with a high degree of autonomy and accountability.
Day-to-day, you can expect a dynamic mix of deep technical work and collaborative engineering. A typical week might look like:
- Writing and reviewing production-quality code in Java or C/C++ for distributed storage components and for the scaling systems behind services owned by the monitoring plane.
- Participating in design reviews and architecture discussions, where you'll debate tradeoffs around consistency, availability, and partition tolerance, and then go build the solution.
- Collaborating with peer engineers across the team to debug complex distributed systems issues, analyze failure patterns, and drive root cause analysis.
- Working closely with the monitoring plane and operations team to improve observability, tune auto-remediation workflows, and reduce mean time to recovery.
- Engaging with product and customer service teams to understand real-world use cases, from IoT and mobile applications to large-scale analytics, and translating those needs into platform capabilities.
- Mentoring junior engineers through code reviews, design feedback, and pairing sessions, helping them grow their technical skills and Amazon engineering judgment.
- Contributing to the team's operational excellence by participating in on-call rotations and driving improvements to system reliability and operational tooling.
About the team
The In-Memory Storage Platform team is a passionate group of engineers who thrive on solving hard distributed systems problems. We are a collaborative, intellectually curious team that values technical depth, ownership, and a bias for action. Our charter is Amazon ElastiCache, an AWS service…