Principal, Software Engineer – Conversational AI
Bentonville, Benton County, Arkansas, 72712, USA
Listed on 2025-12-07
Software Development
Software Engineer, AI Engineer
Overview
Position Summary...
What you'll do...
Cortex Team is Walmart's core AI conversational platform, powering the vision of delivering the world's best personal assistants to Walmart's customers, accessible via natural voice commands, text messages, rich UI interactions, and a mix of all of the above via multi-modal experiences. We believe conversations are a natural and powerful user interface for interacting with technology and enable a richer customer experience both online and in-store.
We are designing and building the next generation of Natural Language Understanding (NLU) services that other teams can easily integrate and leverage to build rich experiences: from pure voice and text shopping assistants (Siri, Sparky), to customer care channels, to mobile apps with rich, intertwined, multi-modal interaction modes (Me@Walmart). Interested in diving in? We need strong engineers with the talent and expertise to design, build, improve, and evolve our capabilities in at least some of the following areas:
- Service-oriented architecture that exposes our NLU capabilities at scale and enables increasingly sophisticated model orchestration.
- Design and build primitives to efficiently orchestrate model-serving microservices, considering dependencies and improving combined latency and robustness.
- Implement functionality to drive improved machine learning modeling and experimental design (e.g., A/B testing).
- Model serving and operations, balancing model improvements with serving latency and cost, and guiding tradeoffs in architecture, tooling, and infrastructure.
- Load testing to clearly identify tradeoffs and tune the model-serving stack.
- Opportunity to work on prompt engineering and agentic systems.
- Tooling, infrastructure and pipelines for reproducible workflows and models, enabling rapid innovation across the product lifecycle.
- Build and maintain pipelines to safely build and deploy models to production via continuous deployment, and provide robust diagnostics for quality control.
- Integrate or build labeling tools connected to data stores (GCP, BigQuery) and coordinate multiple labeling sources.
- Contribute to demos, proofs of concept, white papers, and blog posts.
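To give a flavor of the orchestration work described above, here is a minimal, illustrative Python asyncio sketch of dependency-aware fan-out across model-serving stages. The stage names and dependency graph are hypothetical, not part of the role; the point is that independent stages run concurrently, so combined latency tracks the longest dependency path rather than the sum of all stages.

```python
import asyncio

# Hypothetical model-serving stages and their dependencies. "intent" and
# "entities" both depend only on "tokenize", so they can run concurrently.
DEPENDENCIES = {
    "tokenize": [],
    "intent": ["tokenize"],
    "entities": ["tokenize"],
    "rank": ["intent", "entities"],
}

async def call_stage(name: str, upstream: dict) -> str:
    # Stand-in for an RPC to a model-serving microservice.
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{name}({','.join(sorted(upstream))})"

async def orchestrate() -> dict:
    """Run each stage as soon as all of its dependencies have finished."""
    tasks = {}

    async def run(name: str) -> str:
        upstream = {dep: await tasks[dep] for dep in DEPENDENCIES[name]}
        return await call_stage(name, upstream)

    # Create tasks in topological order so dependency tasks exist first.
    for name in DEPENDENCIES:
        tasks[name] = asyncio.create_task(run(name))
    return {name: await task for name, task in tasks.items()}

results = asyncio.run(orchestrate())
```

In a production stack the stand-in calls would be RPCs with timeouts, retries, and fallbacks, but the dependency-driven scheduling pattern is the same.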
Note that this is not a fully remote job; you are required to come to the office (currently at least 2 days a week).
Responsibilities
- Drive principled, scientific load testing to identify tradeoffs and tune the model-serving stack.
- Design, build and operate scalable NLU services for Walmart's large customer base (roughly 80% of American households).
- Develop primitives to orchestrate model-serving microservices and manage dependencies to reduce latency.
- Collaborate on architecture decisions, tooling, and infrastructure (CPU/GPU, cloud) for model serving based on current research and product needs.
- Support experimentation and ML model development through A/B testing and rigorous evaluation.
- Build and maintain pipelines for production deployment and continuous delivery.
- Develop diagnostics and monitoring to ensure quality and reliability in production systems.
- Work with labeling tools and data stores to support data quality and annotation workflows.
- Contribute to demos, proofs of concept, and technical writing as part of the emerging tech group.
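One common building block behind the A/B-testing responsibility above is deterministic traffic assignment. This is a minimal sketch, assuming hash-based bucketing; the experiment name, user IDs, and 50/50 split are hypothetical:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, split: float = 0.5) -> str:
    """Deterministically assign a user to 'control' or 'treatment'.

    Hashing the (experiment, user) pair keeps assignment stable across
    requests and statistically independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # roughly uniform in [0, 1]
    return "treatment" if bucket < split else "control"

# The same user always lands in the same arm of a given experiment.
v1 = assign_variant("user-42", "nlu-ranker-v2")
v2 = assign_variant("user-42", "nlu-ranker-v2")
```

Stable assignment matters for conversational systems: a user who switches model variants mid-session would see inconsistent behavior and contaminate the experiment's metrics.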
Qualifications
- 8+ years of experience in software engineering or a related area.
- Solid data skills, strong computer-science fundamentals, and programming experience.
- Deep hands-on expertise in full-stack development.
- Programming experience with at least one modern language with an efficient runtime, such as Scala, Java, C++, or C#.
- Experience with at least one relational database technology such as MySQL, PostgreSQL, Oracle, or MS SQL.
- Some fluency in Python.
- Understanding of distributed data processing at scale.
- Ability to work on ambiguous problems and think abstractly.
- Ability to take a project from scoping requirements through launch.
- A continuous drive to improve, automate, and optimize systems and tools.
- Capacity to apply scientific analysis and mathematical modeling to predict and evaluate design outcomes.
- Excellent oral and written communication skills.
- Bachelor's degree or certification in Computer Science, Engineering, Mathematics, or related field.
- Large-scale distributed systems experience, including scalability and fault tolerance.
- Experience leading the development of complex data-driven software systems delivered to customers.
- Strong focus on scalability, latency, performance, robustness, and cost in cloud environments.
- Experience with cloud infrastructure (OpenStack, Azure, GCP, AWS) and infrastructure tools (Docker, Kubernetes).
- Experience building/operating highly available data extraction, ingestion, and massively parallel processing systems for large datasets.
- Hands-on expertise across technologies from front-end to back-end.
- Familiarity with Machine Learning concepts and processes.
- Master's or PhD in Computer Science, Physics, Engineering, Mathematics, or a related field.