Senior AI Engineer (Full-Stack)

Job in Seattle, King County, Washington, 98127, USA
Listing for: 9series
Full Time position
Listed on 2026-06-03
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 120000 - 150000 USD Yearly
Job Description & How to Apply Below
Position: Senior AI Engineer (Full-Stack)

We are hiring a Senior AI Engineer who builds production-grade AI products end-to-end. You will design and ship AI agents, Retrieval-Augmented Generation (RAG) systems, and fine-tuned small language models, while also owning the full-stack delivery from React/Vue/Angular frontends through Python/Node backends to AWS, GCP and Azure deployments.

Equally important: you are an AI-adopted engineer. You use Claude Code, Cursor, Codex, and other AI coding assistants as a daily multiplier, and you know how to use them well — managing context, controlling token spend, writing CLAUDE.md / AGENTS.md files, using subagents and MCP servers, and applying evaluation-driven workflows so that AI-generated code is shipped responsibly.

What You Will Do
  • Design, build and deploy AI agents using LangChain, LangGraph, LlamaIndex, CrewAI or equivalent frameworks, including multi-agent orchestration, tool use, memory, and planning loops.
  • Architect RAG pipelines end-to-end: ingestion, chunking, embedding selection, vector stores (Pinecone / Weaviate / Qdrant / pgvector), hybrid search, re-ranking, query rewriting, and evaluation.
  • Fine-tune small and open-source language models (Llama, Mistral, Phi, Gemma, Qwen) using LoRA, QLoRA, PEFT, instruction tuning and DPO — and decide when fine-tuning is the right answer versus prompting or RAG.
  • Build full-stack AI applications: React/Next.js frontends with streaming UIs (Vercel AI SDK / SSE / WebSockets), FastAPI or Node backends, and well-designed APIs.
  • Own deployment, scaling and observability on AWS (Bedrock, SageMaker, Lambda, ECS/EKS) and GCP (Vertex AI, Cloud Run, GKE), with Docker, Kubernetes, Terraform and CI/CD.
  • Implement LLM observability and evals using LangSmith, Langfuse, RAGAS, and DeepEval, and treat evaluation as a first-class engineering artifact, not an afterthought.
  • Apply AI coding assistants (Claude Code, Cursor, Codex, Windsurf, Copilot) as a daily tool with strong discipline around context management, token efficiency, subagents, hooks, slash commands, and MCP servers.
  • Address non-functional requirements: latency budgets, cost/token economics, prompt injection defense, PII handling, OWASP LLM Top 10, rate limiting, semantic caching, and graceful degradation.
  • Collaborate with product, design and business stakeholders to translate ambiguous problems into shippable AI solutions, and mentor mid-level engineers on AI engineering practices.
Must-Have Skills
  • 4+ years of software engineering and at least 2 years of hands-on production work with LLMs (OpenAI, Anthropic Claude, Gemini, or open-source).
  • Strong RAG experience: chunking strategies, embedding models, vector databases, hybrid search, re-ranking, evaluation, and avoiding common failure modes.
  • Production experience building AI agents with LangChain and LangGraph (or LlamaIndex, CrewAI, AutoGen, Pydantic AI). Comfortable with tool/function calling, structured outputs, agent memory and multi-agent patterns.
  • Experience fine-tuning small/open-source models (LoRA, QLoRA, PEFT) and using Hugging Face Transformers, Datasets, Accelerate, and the Hub.
  • Strong prompt engineering: system design, few-shot, chain-of-thought, prompt caching, structured output schemas, evaluation of prompts as code.
AI-Augmented Development
  • Daily, production-grade use of Claude Code, Cursor, or Codex. Understands CLAUDE.md / AGENTS.md, project memory files, slash commands, subagents, hooks, MCP servers, and plan-vs-execute workflows.
  • Deliberate token and context management: knows when to use Haiku vs Sonnet vs Opus (and equivalents on other providers), uses prompt caching, batches work, prunes context aggressively.
  • Disciplined review of AI-generated code, with tests and evals — never ships unread output.
  • Backend: Python (FastAPI / Flask) and/or Node.js (TypeScript). Solid grasp of async patterns and streaming responses (SSE / WebSockets / API).
  • Frontend: React, Next.js, TypeScript, Tailwind CSS. Comfortable building streaming chat UIs and agentic interfaces.
  • Databases: PostgreSQL, Redis, at least one vector DB. Familiar with schema design, indexing, and query optimization.
Non-Functional Engineering
  • Latency: streaming, parallel tool calls, model…
Position Requirements
10+ Years work experience