Sr AI Engineer, Context Engineering
Listed on 2026-05-08
Software Development
AI Engineer (Applied/Software)
Position Summary
As a Sr. Staff AI Platform Engineer, you are first and foremost a Systems Architect. Your mission is to design and build the high-performance software foundation that powers the enterprise. While your core expertise lies in distributed systems, cloud-native architecture, and platform engineering, you will apply these skills specifically to the "Context Layer" – the specialized infrastructure required to fuel next-generation Agentic AI workflows.
You will operate at the intersection of Systems Programming and Modern AI Infrastructure, solving "hard-tech" problems like real-time data orchestration, automated metadata evolution, and multi-cloud compute optimization. This is a "platform-as-a-product" role; you build the tools, SDKs, and engines that enable hundreds of other engineers to build autonomous agents with ease.
Key Responsibilities
- AI Platform Strategy & Context Retrieval: Define and own the 3-5 year technical roadmap for our high-scale, AI-ready Data Lakehouse. This platform must be explicitly optimized for AI Agent operations and efficient context retrieval, delivering low-latency, high-throughput data access essential for vector databases and LLM-driven applications.
- Systems & Agentic R&D: Prototype and benchmark emerging trends in the AI ecosystem. Evaluate next-generation architectural patterns such as Multi-Agent Orchestration, autonomous long-term memory management, and specialized Agent Evaluation frameworks to ensure the platform remains at the cutting edge.
- Engineering Excellence: Set the gold standard for code quality, CI/CD, and system design across the organization. Lead cross-functional architecture reviews and serve as the final escalation point for the most complex technical bottlenecks.
- Agentic Ecosystem Enablement: Design the platform-level interfaces required for Agentic workflows, focusing on standardized "Host-to-Server" communication and tool-execution environments. Build robust Human-in-the-Loop triggers and fail-safe mechanisms for autonomous actions.
- Contextual Infrastructure: Build the "Context Fabric" that allows AI agents to securely discover, access, and interpret enterprise data. Architect systems that move beyond basic search into Reasoning-based Retrieval.
- Protocol & Trend Standardization: Implement and advocate for emerging standards like the Model Context Protocol (MCP) and stay ahead of trends such as Small Language Models for edge-compute and Agentic RAG.
- Expert Software Engineering: 15+ years in software engineering, expert in Java or Scala (distributed systems focus) and Python.
- Systems Architecture: Deep experience building extensible frameworks, high-throughput APIs, and libraries used by other developers. Prioritize building software-defined infrastructure over manual configuration.
- Agentic Design Patterns: Hands-on experience with the latest trends in agent development, such as Multi-Agent Orchestration (using frameworks like LangGraph or CrewAI) and the transition from static RAG to Agentic RAG.
- Protocol Interoperability: Knowledge of the Model Context Protocol (MCP) and other emerging standards that allow AI agents to interact with diverse data sources and tools in a plug-and-play manner.
- AI-Ops Integration: Experience building AI-native CI/CD features, such as automated LLM-based evaluations and automated root-cause analysis for system failures.
- Human-in-the-Loop (HITL): Understanding of building automated workflows that pause agent actions for human approval, ensuring safety and governance for autonomous systems.
- GitOps & Continuous Delivery: Expert-level experience with GitOps workflows (e.g., ArgoCD or Flux) to ensure that all platform configurations, including AI prompt templates and model parameters, are versioned, audited, and automatically reconciled.
- Infrastructure-as-Code (IaC) at Scale: Mastery of Terraform. Build modular, reusable libraries that enforce organizational security and cost-efficiency standards across hundreds of cloud accounts.
- Modern CI Pipelines: Proficiency in designing complex pipelines (e.g., GitHub Actions, GitLab CI) that integrate automated testing, security scanning, and deployment gates for high-availability systems.
- Unified Observability: Experience with OpenTelemetry (OTel) to build deep visibility into distributed systems and to track both system performance and business-centric AI metrics.
- Cloud Console & Service Mastery: Deep proficiency navigating and configuring AWS and Azure Management Consoles. Understand how to architect, secure, and optimize core services (IAM, EC2/VMs, S3/Blob, and specialized AI/ML services) natively within both ecosystems.
- Cloud-Agnostic Abstraction: Build platform layers that bridge AWS and Azure for seamless deployment and management across a multi-cloud…