Staff Engineer, Software Development Engineering; Apps
Job in Milpitas, Santa Clara County, California, 95035, USA
Listed on 2026-06-01
Listing for: Sandisk
Full Time position
Job specializations:
- Software Development
AI Engineer, Cloud Engineer - Software, Software Engineer, DevOps
Company Description
Sandisk understands how people and businesses consume data and we relentlessly innovate to deliver solutions that enable today's needs and tomorrow's next big ideas. With a rich history of groundbreaking innovations in Flash and advanced memory technologies, our solutions have become the beating heart of the digital world we're living in and that we have the power to shape.
Sandisk meets people and businesses at the intersection of their aspirations and the moment, enabling them to keep moving and pushing possibility forward. We do this through the balance of our powerhouse manufacturing capabilities and our industry-leading portfolio of products that are recognized globally for innovation, performance and quality.
Sandisk has two facilities recognized by the World Economic Forum as part of the Global Lighthouse Network for advanced 4IR innovations. These facilities were also recognized as Sustainability Lighthouses for breakthroughs in efficient operations. With our global reach, we ensure the global supply chain has access to the Flash memory it needs to keep our world moving forward.
Job Description
We are hiring a Staff Engineer to join our AI Platform team, which builds and operates our enterprise AI platform for engineering workflows. This role focuses on the foundational platform layer - agentic systems, MCP ecosystem, LLM gateway, memory and knowledge infrastructure, and platform observability - that powers AI applications across the Flash Products Group.
Essential Duties and Responsibilities:
- Platform Development: Design, build, and maintain core components of the Nexus platform, including agentic orchestration (LangGraph / Deep Agents), MCP servers and gateway integrations, memory and knowledge systems, and the LLM gateway layer.
- Agentic Systems: Develop and extend multi-agent workflows, tool-calling pipelines, and conversational AI experiences across Nexus Builder, Nexus Chat, and purpose-built applications.
- MCP Ecosystem: Build OIDC-compliant MCP servers that expose enterprise data and tools (Jira, test management, internal systems) to agents in a secure, governed way.
- Reliability & Observability: Drive platform stability through environment separation, monitoring, tracing, evaluation pipelines, and incident response practices.
- Technical Leadership: Own technical decisions within assigned work streams, conduct design reviews, and mentor engineers on agentic patterns, LLM application design, and production AI best practices.
- Cross-Functional Collaboration: Partner with InfoSec, Cloud Infrastructure, IAM, and product engineering teams to ship platform capabilities that meet enterprise requirements.
- Continuous Innovation: Stay current with advances in LLMs, agentic frameworks, and AI infrastructure; evaluate and integrate new technologies into the Nexus roadmap.
Required:
- Master's degree in Artificial Intelligence, Machine Learning, Data Science, Computer Science, or a related field.
- Approximately 5 years of professional software engineering experience, with demonstrated impact building production AI/ML systems or developer platforms.
- Strong proficiency in Python; working knowledge of TypeScript/JavaScript and React.
- Hands-on experience with modern AI/ML frameworks: LangChain, LangGraph, LlamaIndex, PyTorch or TensorFlow, and the Hugging Face ecosystem.
- Practical understanding of LLMs, transformers, embeddings, RAG architectures, and agentic design patterns.
- Experience integrating with LLM providers and gateways (Anthropic, OpenAI, Portkey, or equivalent), and familiarity with the Model Context Protocol (MCP).
- Solid grasp of distributed systems, microservices, REST/GraphQL APIs, and event-driven architectures.
- Experience deploying and operating workloads on Kubernetes with Docker, including hybrid on-prem / cloud topologies.
- Comfort with relational (PostgreSQL), NoSQL (MongoDB, Elasticsearch), in-memory (Redis / Valkey), and vector databases.
- Familiarity with OAuth / OIDC, SSO, and enterprise security and compliance practices.
- Strong written and verbal communication, with the ability to articulate technical tradeoffs to engineering and leadership audiences.
- Proven ability to lead initiatives end-to-end, work independently, and collaborate effectively across teams.
- Bias for action, ownership mindset, and a track record of shipping in fast-moving environments.
- Problem Solving: Demonstrated ability to decompose ambiguous problems, design pragmatic solutions, and debug complex issues across the AI/ML and infrastructure stack.
- Programming: Python, TypeScript/JavaScript, React, GraphQL
- AI/ML & Agents: LangGraph, LangChain, LlamaIndex, PyTorch/TensorFlow, RAG, MCP, agentic frameworks
- Platform & Infra: Kubernetes, Docker, microservices, REST APIs, CI/CD
- Data: RDBMS (PostgreSQL), NoSQL (MongoDB, Elasticsearch), in-memory (Redis/Valkey), vector databases
- Security & Identity: OAuth, OIDC, SSO, enterprise auth patterns
Sandisk is committed to…