Sr. SRE Engineer AI | LLM

Job in Bellevue, King County, Washington, 98009, USA
Listing for: Net2Source (N2S)
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below
Position: Sr. SRE Engineer (OpenAI | LLM)
  • Design, deploy, and operate enterprise AI Gateway infrastructure supporting OpenAI and internal LLM-based services.
  • Implement and manage regional routing (east/west), failover strategies, and upstream host configurations for AI traffic.
  • Develop and maintain Helm charts, Kubernetes manifests, and Jinja templates for multi-environment deployments (dev, plab, qlab).
  • Enable per-API configuration for rate limiting, AI feature toggles, security credentials, and regional host overrides.
  • Stay current with industry best practices for:
    • AI Gateways and MCP servers
    • Secure LLM consumption patterns
    • Token handling, secrets management, and request isolation
    • Observability standards for AI platforms
  • Lead bi-weekly technical and operational syncs with AI Gateway vendors.
  • Translate vendor capabilities, limitations, and roadmaps into actionable platform strategies.
  • Communicate clearly in both technical and business terms with:
    • Engineering teams
    • SRE
    • Security & compliance
    • Product and leadership stakeholders
Reliability, Observability & Operations
  • Build and maintain monitoring and troubleshooting frameworks for AI workloads using Splunk and Grafana.
  • Author and evolve SRE support cookbooks for proactive monitoring, incident response, and escalation.
  • Analyze failure rates, latency spikes, and request flows across distributed AI systems.
  • Support on-call readiness through actionable dashboards, alerts, and operational runbooks.
CI/CD & Automation
  • Build CI pipelines to generate and deploy environment-specific configurations at scale.
  • Automate service registration, deployment validation, and environment promotion.
  • Enforce consistent naming, versioning, and deployment standards across clusters and environments.
Cross-Functional Collaboration
  • Act as a technical bridge between application teams, SRE, security, and platform engineering.
  • Provide architectural guidance for teams onboarding to AI Gateway and Enterprise GPT platforms.
  • Contribute to platform roadmaps, technical design reviews, and operational readiness planning.
Required Qualifications
  • Strong experience with Kubernetes, Helm, and cloud-native networking.
  • Hands-on experience with Istio / service mesh, routing rules, and traffic management.
  • Proficiency in Python, Bash, and Jinja templating for infrastructure automation.
  • Experience operating production-grade APIs with high reliability and observability standards.
  • Deep understanding of SRE principles, monitoring, alerting, and incident management.
  • Experience building observability frameworks using Splunk, Grafana, or similar tools.
  • Strong ability to communicate complex technical issues in clear business terms.
  • Experience working with AI/LLM APIs (OpenAI or similar) in an enterprise context.
Preferred Qualifications
  • Knowledge of MCP servers, AI gateway patterns, and LLM security models.
  • Familiarity with security controls for AI platforms (secrets management, token handling, access controls).
  • Experience supporting multi-region, multi-environment deployments at scale.
  • Strong documentation skills with a focus on operational clarity and enablement.
Keywords: CI/CD, Helm, IT infrastructure, Jinja, Kubernetes, computer science, containerization, continuous integration, information technology, software components, software development, software engineering, templating engines, virtualization