Site Reliability Developer
Listed on 2026-01-02
IT/Tech
Cloud Computing, Systems Engineer
United States
Redwood City, CA, United States
Austin, TX, United States
- Job Identification: 320892
- Job Category: Product Development
- Posting Date: 12/19/2025, 04:53 PM
- Job Type: Regular Employee
- Does this position require a security clearance? No
- Years: See Job Description
- Additional Info: Visa / work permit sponsorship is not available for this position
- Applicants are required to read, write, and speak the following languages: English
Executive Summary:
SPRE Architect Role Requirements
Oracle is seeking a Strategic Platform Reliability Engineering (SPRE) Architect to strengthen the architectural foundation and operational resilience of key SaaS offerings, ensuring availability, security, and compliance for top-tier customers. The SPRE Architect will lead cross-functional collaboration with SaaS and OCI teams, applying best practices and commercial blueprints to deliver highly available, future-ready cloud services.
Key responsibilities include safeguarding service uptime, driving automation, enhancing monitoring, and responding to critical incidents in a 24x7 environment. The role demands strong leadership, technical acumen across the full technology stack, and proven experience in large-scale service operations, with a focus on proactive system hardening and continuous improvement. Communication at all levels—including C-suite engagement—is essential, along with stakeholder management and mentoring junior team members.
Candidates must demonstrate expertise in compliance, Linux systems, cloud networking, programming/scripting languages, and DevOps tools. Experience supporting secure cloud environments and customer-facing web services at scale is required, along with a strong customer service orientation and the ability to thrive in high-pressure, evolving environments.
In summary:
This is a senior and highly technical leadership role, accountable for the design, resilience, compliance, and operational excellence of Oracle’s SaaS services for its most strategic customers.
Top Skills for the SPRE Architect Role:
- Technical Leadership & Stakeholder Management
Proven ability to lead teams, drive cross-functional collaboration, mentor junior members, and engage with stakeholders at all organizational levels, including the C-suite.
- Cloud Architecture & Operational Excellence
Deep expertise in designing, implementing, and maintaining large-scale, highly available, and secure cloud services.
- Site Reliability Engineering (SRE) Principles
Strong background in SRE best practices, including automation, monitoring, incident response, and continuous improvement.
- Incident Management & Crisis Response
Expertise in monitoring, diagnosing, resolving, and communicating about critical service incidents in a 24x7 environment.
- Compliance & Security Fundamentals
Knowledge of compliance standards relevant to enterprise cloud software and experience securing cloud infrastructure.
- Analytical & Problem-Solving Skills
Strong analytical abilities to troubleshoot complex systems, identify root causes, and develop resilient solutions.
- Communication & Customer Service Orientation
Excellent verbal and written communication skills, with the ability to clearly convey technical and business information to diverse audiences and ensure customer satisfaction during high-pressure situations.
- Experience with Large-Scale, Customer-Facing Web Services
Demonstrated experience operating and scaling major web-based services for enterprise customers.
- Automation & DevOps Tools
Familiarity with automation frameworks and DevOps tools (e.g., JIRA, Confluence) to streamline operations and maintain high availability.
- Programming & Scripting
Proficiency in one or more scripting/programming languages (e.g., Python, Bash, PowerShell, Java, Ruby) to automate operational tasks and tooling.
- Design, develop, and maintain large-scale, highly available, and secure cloud services.
- Lead cross-team collaboration for service resiliency, compliance, and operational excellence.
- Safeguard service uptime by monitoring, automating, and responding to incidents around the clock.
- Apply site reliability engineering best practices for automation, monitoring, and continuous…