Systems Analyst Job, Austin area, Texas, USA, IT/Tech

Position: Systems Analyst 3 529601671

Hybrid - On-Site and Telework (must live in the Austin, TX area)

Responsibilities

The Site Reliability Engineer is responsible for ensuring the reliability, availability, performance, and scalability of production systems by applying software engineering practices to infrastructure and operations. The role partners with development teams to build resilient, observable, and automated platforms that meet defined service level objectives (SLOs).

Plans and accomplishes goals relying on experience and judgment. Independently performs a variety of complicated tasks, demonstrating creativity and latitude. Understands business objectives and problems, identifies alternative solutions, and performs studies and cost/benefit analyses of alternatives. Analyzes user requirements, procedures, and problems to automate processing or improve existing computer systems. Confers with personnel of the organizational units involved to analyze current operational procedures, identify problems, and learn specific input and output requirements, such as forms of data input, how data is to be summarized, and formats for reports.

Writes detailed descriptions of user needs, program functions, and the steps required to develop or modify computer programs. Reviews computer system capabilities, specifications, and scheduling limitations to determine whether a requested program or program change is possible within the existing system.

Minimum Requirements (Required)

8+ years of experience in systems engineering, DevOps, or site reliability engineering roles
Strong experience with Linux/Unix systems and system internals
Proficiency in one or more programming/scripting languages (Python, Go, Java, Bash)
Experience designing and operating highly available, distributed systems
Strong knowledge of cloud platforms (AWS or GCP) and cloud-native services
Experience with containerization and orchestration (Docker, Kubernetes)
Strong understanding of monitoring, alerting, and logging concepts
Experience defining and managing SLIs, SLOs, and error budgets
Familiarity with incident management, root cause analysis (RCA), and postmortems
Experience integrating security and compliance into operational workflows

Preferred Qualifications

Familiarity with observability tools (Prometheus, Grafana, Application Insights, Datadog, Splunk)
Experience operating 24x7 production environments with on-call rotations
Experience with chaos engineering and resiliency testing
Experience with feature flags, canary deployments, and progressive delivery
Strong documentation skills for runbooks, dashboards, and operational standards

Employment Terms

Expected Start Date: 05/01/2026. Expected End Date: 08/31/2026. May be renewed up to 3 years.

Location and Work Schedule

Hybrid - On-site and Telework. The program is open to candidates local to the Austin area (within a 50-mile radius). The schedule is 3 days remote and 2 days (Mondays and Thursdays) on-site in Austin, TX 78751.

Normal business hours are Monday through Friday from 8:00 AM to 5:00 PM, excluding state holidays. Worker may be required to work outside normal business hours on weekends, evenings and holidays as requested.
