Operations Analyst

Job in Menlo Park, San Mateo County, California, 94025, USA
Listing for: Hippocratic AI
Full Time position
Listed on 2026-06-27
Job specializations:
  • IT/Tech
    IT Support, SRE/Site Reliability, Cybersecurity, Cloud Computing: Infrastructure & Operations
Job Description & How to Apply Below

We are seeking a highly reliable and detail-oriented Operations Analyst to ensure the continuous, 24×7 operation of Hippocratic AI's production systems, integrations, and customer/partner environments. This role is critical to minimizing customer and partner downtime, maintaining trust, and ensuring our AI agents and supporting systems operate smoothly at all times.

As an Operations Analyst, you will be responsible for monitoring system alerts, integrations, and operational reports; performing proactive maintenance; resolving common operational issues; and triaging advanced issues to the appropriate engineering, platform, or partner teams. You will play a central role in detecting issues early, coordinating incident response, and maintaining operational excellence across all customer and partner deployments.

You will work closely with engineering, infrastructure, security, customer support, and partner teams, and will help build the operational tooling, reporting, and automation needed to scale Hippocratic AI safely and reliably.

You are expected to work from our Palo Alto office five days a week, unless otherwise specified.

What You'll Do
  • Integration Management & Development

  • Monitor all production systems, integrations, and automated alerts to ensure 24×7 continuous operations across customers and partners.

  • Serve as a first-line responder for operational alerts, diagnosing and resolving standard issues within defined SLAs.

  • Triage complex or advanced issues and page/engage the appropriate on-call engineers, platform teams, or partner contacts.

  • Coordinate incident response activities, track progress to resolution, and ensure clear internal handoffs during escalations.

  • Validate system recovery and perform post-incident checks to ensure full service restoration.

  • Proactive Maintenance & Reliability

  • Perform proactive system health checks, integration validations, and routine maintenance to prevent outages and degradation.

  • Identify trends in alerts, incidents, and performance metrics to recommend preventative actions and long-term fixes.

  • Help define and refine operational runbooks, escalation paths, and standard operating procedures (SOPs).

  • Participate in on-call rotations and support after-hours and weekend coverage as needed to maintain 24×7 availability.

  • Reporting, Automation & Tooling

  • Create and maintain operational reports and dashboards for internal teams, customers, and partners.

  • Build and maintain scripts and automation to monitor system health, validate integrations, and generate customer- or partner-specific reports.

  • Customize operational reporting for each customer/partner to meet contractual, SLA, and compliance requirements.

  • Continuously improve monitoring, alerting, and observability tooling to reduce noise and increase signal quality.

  • Cross-Functional Collaboration

  • Work closely with engineering, infrastructure, security, and customer support teams to resolve incidents and improve system resilience.

  • Support customer-facing teams by providing operational insights, incident summaries, and root-cause analysis.

  • Assist with onboarding new customers and partners by validating integrations, monitoring readiness, and ensuring operational coverage.

  • Contribute to post-incident reviews and continuous improvement initiatives to strengthen overall platform reliability.

What You Bring
  • Bachelor's degree in Computer Science, Health Informatics, Information Systems, Operations, Engineering, or a related field (or equivalent practical experience).

  • 3+ years of experience in operations, site reliability, NOC, technical support, or production monitoring roles.

  • Hands-on experience monitoring production systems, integrations, APIs, or data pipelines in a 24×7 environment.

  • Familiarity with alerting and monitoring tools (e.g., Datadog, New Relic, CloudWatch, Prometheus, Grafana, PagerDuty, Opsgenie, or similar).

  • Ability to troubleshoot common system, integration, and data-flow issues using logs, metrics, and dashboards.

  • Experience writing scripts or automation using tools/languages such as Python, Bash, SQL, or similar.

  • Strong…
