Member of Technical Staff, AI Reliability & Monitoring Engineering Lead, San Francisco, California

Job in Cupertino, Santa Clara County, California, 95014, USA
Listing for: Postman
Part Time position
Listed on 2025-12-01
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing

Member of Technical Staff, AI Reliability & Monitoring Engineering Lead

Who Are We?

Postman is the world’s leading API platform, used by more than 40 million developers and 500,000 organizations, including 98% of the Fortune 500. Postman is helping developers and professionals across the globe build the API-first world by simplifying each step of the API lifecycle and streamlining collaboration—enabling users to create better APIs, faster.

The company is headquartered in San Francisco and has offices in Boston, New York, and Bangalore, where Postman was founded. Postman is privately held, with funding from Battery Ventures, BOND, Coatue, CRV, Insight Partners, and Nexus Venture Partners. Connect with Postman on X via @getpostman.

Postman is seeking an experienced AI Systems Reliability Engineer to help define, build, and maintain the infrastructure and processes that ensure the reliability, scalability, and performance of Postman’s AI-powered API and agentic systems in production. This role focuses on monitoring, availability, incident response, and automation to support AI services and tools trusted by millions of developers globally.

What You’ll Do
  • Develop and manage reliability metrics (SLOs) for AI-driven API services and agentic AI platform features
  • Implement comprehensive observability and monitoring systems for real-time performance and fault detection
  • Design and drive automated failover, recovery, and incident response strategies for high-availability AI infrastructure
  • Collaborate closely with engineering, platform, and product teams to align reliability efforts with broader organizational goals
  • Lead efforts to build internal tooling and automation focused on AI system stability and operational excellence
  • Drive continuous improvement in deployment practices, monitoring approaches, and incident management processes
About You
  • Have a strong background in AI reliability engineering, SRE, or DevOps for distributed systems
  • Understand the unique challenges of maintaining large-scale AI systems and integrating AI-specific metrics into reliability frameworks
  • Are experienced with cloud platforms, monitoring tools, and incident response automation
  • Are comfortable collaborating across teams to influence best practices for AI system reliability and operational health
  • Thrive in dynamic, fast-paced environments focusing on delivering reliable, safe AI-powered services
Bonus Skills and Experiences
  • Hands-on experience with AI/ML infrastructure, including GPU/xPU optimization and scaling
  • Familiarity with API platform operations and large-scale distributed services
  • Prior experience building or operating observability tools tailored for AI and agentic systems
  • Contribution to open-source projects or reliability engineering thought leadership
What Else?

In addition to Postman's pay-on-performance philosophy, and a flexible schedule working with a fun, collaborative team, Postman offers a comprehensive set of benefits, including full medical coverage, flexible PTO, wellness reimbursement, and a monthly lunch stipend. Our wellness programs help you stay in the best of your physical and mental health. Our frequent team-building events keep you connected, while our donation-matching program can support the causes you care about.

We’re building a long-term company with an inclusive culture where everyone can be the best version of themselves.

At Postman, we embrace a hybrid work model. For all roles based out of the San Francisco Bay Area, Boston, Bangalore, Hyderabad, and New York, employees are expected to come into the office 3 days a week. We balance flexibility and collaboration, grounded in feedback from our workforce, leadership team, and peers. The benefits of our hybrid office model include knowledge sharing, brainstorming, and in-person collaboration that cannot be replicated via video conferencing.

Our Values

At Postman, we create with the same curiosity that we see in our users. We value transparency and honest communication about not only successes, but also failures. In our work, we focus on specific goals that add up to a larger vision. Our inclusive work culture ensures that everyone is valued equally as an important piece of our final product. We are dedicated to delivering the best products we can.

Equal Opportunity

Postman is an Equal Employment Opportunity and Affirmative Action Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, marital status, protected veteran status, or disability status. Postman does not accept unsolicited headhunter and agency resumes. Postman will not pay fees to any third-party agency or company that does not have a signed agreement with Postman.
