Senior AI Platform Engineer - Data and Systems
Listed on 2026-06-18
Software Development
AI Engineer (Applied/Software), Backend Developer, Cloud Engineer - Software, Machine Learning/ML Engineer
The Opportunity
Adobe Express Data Platform is the intelligence backbone for millions of creators: a billion-event-per-day system spanning streaming, feature serving, agent data APIs, and a lakehouse that powers every personalization decision, experiment, and AI workflow. We are evolving it into a streaming-first, self-healing, agent-ready lakehouse, and we need engineers who challenge the status quo, move fast, and default to an agentic-first approach for every problem they encounter.
This is a systems-first engineering role. You won't build ML models; you'll build the foundational infrastructure that makes AI, analytics, and autonomous agents possible. You'll bring the conviction that any manual, repetitive, or slow platform workflow is a candidate for agentic automation, and the engineering skill to make that real.
We are tackling hard, consequential problems: collapsing multi-hour pipeline latency to real-time, building MCP-compatible agent data APIs so autonomous AI systems can query and reason over platform data, evolving our ML Attribute Store with low-latency online feature serving, and pioneering AI-powered data governance that replaces manual operational toil with self-healing pipelines. Our team's motto is simple: make the platform simpler, faster, and more reliable.
Shipping fast isn't reckless here - it's a discipline.
- Design and build streaming-first data pipelines that use event-driven architectures to collapse end-to-end latency from hours to minutes.
- Own and extend the ML Attribute Store - building low-latency online serving capabilities alongside batch feature computation with unified batch/streaming aggregation to prevent training-serving skew.
- Build MCP-compatible Agent Data APIs and tool servers that make the lakehouse discoverable and queryable by autonomous AI agents through standardized protocols, semantic layers, and catalog-driven data discovery.
- Develop an agentic framework for automated anomaly detection, duplicate event cleanup, transient event lifecycle management with audit trails, pipeline self-healing, and root cause analysis automation.
- Drive operational excellence: observability, incident detection and response automation, performance tuning, cost optimization, and on-call ownership for mission-critical platform services.
- Collaborate across Data Science, Personalization, Engineering Operations, Product, and Experimentation teams to translate platform capabilities into self-serve infrastructure that reduces engineering toil for non-platform teams.
- Use and champion AI-powered developer tools (Claude Code, Cursor, GitHub Copilot, or similar) to accelerate personal and team engineering velocity.
- 6+ years of experience in data platform engineering, distributed systems, or backend infrastructure at scale.
- Deep hands-on experience with Apache Spark, Databricks, Delta Lake, or equivalent lakehouse technologies (Iceberg, Hudi).
- Proven track record building and operating large-scale pipelines processing billions of events daily with sub-hour latency SLAs.
- Strong experience with streaming systems: Kafka, Kinesis, Flink, Spark Structured Streaming, or Delta Live Tables.
- Proficiency in Python and/or Scala; SQL fluency required. Java or Go is a plus.
- Experience with cloud platforms (AWS or Azure), containerization (Docker, Kubernetes), and CI/CD for data pipelines.
- Production experience integrating LLMs into engineering workflows - not prototypes, but systems running against real data with real users. Includes prompt engineering, tool-use/function-calling, structured output parsing, and context window management.
- Hands-on experience with agentic AI frameworks and multi-agent orchestration (LangChain, LangGraph, CrewAI, AutoGen, or custom agent loops with memory, planning, and tool routing).
- Understanding of MCP (Model Context Protocol) and/or A2A protocols for exposing platform capabilities as agent-consumable tool servers - or demonstrable ability to build equivalent agent-tool integration surfaces.
- Experience building or operating ML Feature Stores (online and/or offline), including…