Backend Engineer, AI Systems Job, Palo Alto area, California, USA, Software Development

About the Company

A1 is building a proactive AI chat app for everyday users to bring intelligence to conversations, errands, organising and workflows. Unlike traditional chat-based applications, our product focuses on achieving high reliability for long-running workflows, persistent context, and real-world task completion. The system must handle multi-step reasoning, interact with external tools, and remain reliable despite non-deterministic model behavior.

Role Overview

As a Backend Engineer, AI Systems, you own the inference and orchestration layer that powers every AI interaction in the product. Your work sits between models and users, where latency, correctness, reliability, and cost directly affect the real-world experience. You will build and operate production systems that turn model capability into fast, stable, observable APIs used across mobile and desktop clients.

Focus

Build and operate backend systems that serve AI-powered features in production.
Design inference pipelines and orchestration layers that handle multi-step workflows, tool calls, and retries.
Manage the full lifecycle of AI requests: routing, caching, batching, streaming, and state management.
Optimize latency, throughput, and cost across model inference and downstream systems.
Design systems that remain reliable despite non-deterministic model behavior and external dependencies.
Implement observability for AI systems, including logging, tracing, and debugging of model outputs and failures.
Collaborate with ML and product teams to translate model capabilities into stable, production-grade APIs.

Ideal Experience

Strong backend engineering fundamentals in production environments.
Experience running high-throughput, low-latency services.
Familiarity with AI inference patterns (LLMs, embeddings, multimodal).
Comfortable debugging distributed systems under load.
Bias toward shipping and learning from production behavior.

Outcomes

Backend systems run reliably at scale, handling production AI traffic with low latency and high throughput.
Multi-step AI workflows complete successfully across tools and services, with robust handling of failures and retries.
APIs are stable, clear, and support seamless integration with frontend and ML systems.
Production incidents are quickly detected, diagnosed, and resolved, minimizing user impact.
Iterative improvements based on real usage continuously increase system performance and reliability.
System design evolves to support increasing scale, complexity, and new AI capabilities without major rewrites.

Tech Stack

Python
Node.js
PyTorch
OpenAI / Anthropic / open-source LLMs
SQL & NoSQL
Kubernetes
Docker
