AI/ML Engineer - Agentic | San Jose area, California, USA | Software Development

Job Definition

The AI/ML Engineer – Agentic is a senior individual contributor responsible for designing, building, and operating a production‑grade agentic orchestration platform. The role focuses on enterprise‑scale LLM integration, shared retrieval and memory services, and high‑performance backend systems that power agent execution. The engineer will own reliability, observability, and cloud‑native operations for non‑deterministic agentic systems in production.

Responsibilities

Design, build, and own a production‑grade agentic orchestration platform, implementing scalable multi‑agent workflows using frameworks such as LangGraph or equivalent.
Architect, develop, and operate the MCP server infrastructure, including inter‑agent communication, tool/server registries, domain isolation, versioning, and lifecycle management.
Integrate and operate LLM services at enterprise scale, supporting streaming, structured outputs, tool/function calling, and robust error handling across agent workflows.
Build and maintain retrieval and memory services for agentic systems, including RAG pipelines, OpenSearch‑backed vector stores, hybrid search, and relevance optimization.
Develop and operate high‑performance backend services (FastAPI, gRPC, async systems, messaging) that power orchestration, tool execution, and agent runtime behavior.
Own observability and reliability for non‑deterministic systems, delivering end‑to‑end tracing, monitoring, and cost/performance visibility for agent executions.
Manage cloud‑native infrastructure and deployment, including Kubernetes workloads, containerized services, CI/CD pipelines, and resource optimization (CPU/memory, autoscaling).

Education and Experience Required

Bachelor’s degree in computer science, engineering, information systems, or a closely related quantitative discipline. Master’s degree desirable.
Typically 4‑7 years of relevant experience.

Core Agentic/Orchestration Skills

Production experience (not just prototypes) with agentic frameworks such as LangGraph (preferred), Claude Agent SDK, or equivalent.
Deep understanding of multi‑agent architectures: supervisor/worker patterns, hierarchical agent graphs, ReAct loops, ReWOO.
Hands‑on with inter‑agent communication protocols: MCP (Model Context Protocol), A2A, tool registry / server registry.

LLM & ML Engineering

LLM API integration at scale: structured outputs, streaming, function/tool calling, error handling.
RAG pipeline design and optimization: chunking strategies, re‑ranking, hybrid search, and knowing which knobs to turn for which issues.
Vector store experience: OpenSearch or equivalent.
Applied ML intuition: fine‑tuning concepts, prompt engineering, evaluations, QLoRA, PEFT.

Infrastructure & Production Systems

Backend development: FastAPI, gRPC, Kafka, Redis, message queues, async system design; Python, API design, GraphQL and/or REST at enterprise scale.
Observability and monitoring for non‑deterministic systems: Langfuse, Prometheus, or equivalent.
Kubernetes: deploying, scaling, and managing workloads (Deployments, Services, ConfigMaps, Secrets).
Container image management: building, tagging, versioning, and pushing images via Docker; familiarity with a container registry (ECR, GCR, Docker Hub).
CI/CD pipelines for automated build and deploy (GitHub Actions, Jenkins, Argo CD, or similar).
Resource management: CPU/memory limits, autoscaling (HPA/VPA), health probes.

Additional Preferred Skills

Multi‑tenant architecture awareness: rate limiting, auth, tenant isolation.
Knowledge base and cost optimization experience: AWS Bedrock, OpenSearch Serverless.

Benefits

Health & Wellbeing – comprehensive suite of benefits supporting physical, financial, and emotional wellbeing.
Personal & Professional Development – programs to help reach career goals, learn new skills, or transition to another division.
Unconditional Inclusion – inclusive work environment celebrating individual uniqueness and value of varied backgrounds.

Salary

Annual salary for this position in California ranges from USD 136,500 to 276,500, based on geographic location, work experience, education, and skill level. Variable incentives may also be offered.

EEO Statement

HPE is an Equal Employment Opportunity, Veterans, Disabled, LGBTQ employer. We do not discriminate on the basis of race, gender, or any other protected category, and all decisions are made on the basis of qualifications, merit, and business need. HPE will comply with all applicable laws related to employer use of arrest and conviction records, including laws requiring consideration of qualified applicants with criminal histories.

Recruitment Fraud Alert

HPE and its authorized recruitment agencies will never charge a candidate a registration or hiring fee. Candidates should verify credentials through official channels and report any suspicious communication.
