
Machine Learning Engineer III / Senior Machine Learning Engineer - AI Platform

Job in Atlanta, Fulton County, Georgia, 30383, USA
Listing for: Workday
Full Time position
Listed on 2026-06-05
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 80000 - 100000 USD Yearly
Job Description & How to Apply Below
Position: Machine Learning Engineer III / Senior Machine Learning Engineer - AI Platform

Workday’s mission is to make hard work pay off. As a Fortune 500 company and a leading AI platform for managing people, money, and agents, we shape the future of work so teams can reach their potential and focus on what matters most. You’ll feel our culture of integrity, empathy, and shared enthusiasm the moment you join, as we tackle big challenges with bold ideas and genuine care.

About the Team

The AI Platform organization builds advanced AI solutions that power the core Workday software by modeling user behavior and providing intelligent automation. We create features and solutions used by millions of end‑users, making work easier and more balanced for Workday’s global customer base.

About the Role

As a Machine Learning Engineer, you will help design and build our Agent Platform—the core infrastructure that enables teams to develop, deploy, orchestrate, and operate AI agents in production. The focus is on building systems and tooling to host and scale agent‑based applications powered by large language models (LLMs). You will partner closely with applied AI, product, and infrastructure teams to define how agents are built and operated across the organization.

Responsibilities
  • Design and build the core platform capabilities required to develop, host, and operate AI agents at scale.
  • Develop infrastructure and services for agent execution, orchestration, state management, and runtime reliability.
  • Build reusable abstractions, frameworks, and workflows in Python to support agent development patterns across teams.
  • Design and implement systems for tool use, memory, retrieval, workflow coordination, and human‑in‑the‑loop interactions.
  • Build and maintain services deployed on Kubernetes, focusing on scalability, resiliency, and operational excellence.
  • Develop capabilities for evaluation, tracing, observability, debugging, and performance monitoring of agent behavior in production.
  • Improve platform performance across latency, throughput, fault tolerance, and cost efficiency.
  • Create internal APIs, SDKs, and developer tooling that make it easier for engineering teams to build on the platform.
  • Partner with cross‑functional teams to productionize new agent use cases and establish common platform patterns and best practices.
  • Contribute to technical architecture and help define the roadmap for agent infrastructure and platform evolution.
Basic Qualifications (MLE III)
  • 3+ years experience as part of a data science or machine learning software development team, or a PhD/equivalent program.
  • 5+ years experience in Python and building reliable, maintainable production services.
  • 3+ years experience with distributed systems, APIs, asynchronous workflows, and service‑oriented architecture.
  • 3+ years experience designing systems with a focus on scalability, reliability, observability, and maintainability.
Basic Qualifications (Sr. MLE)
  • 6+ years of software engineering experience, including building and operating production‑grade backend, ML, or platform systems.
  • 8+ years experience in Python and building reliable, maintainable production services.
  • 5+ years experience with distributed systems, APIs, asynchronous workflows, and service‑oriented architecture.
  • 5+ years experience designing systems with a focus on scalability, reliability, observability, and maintainability.
Preferred Qualifications
  • Experience building or supporting agent platforms, AI infrastructure, or internal developer platforms.
  • Experience building and deploying machine learning or LLM‑powered applications in production.
  • Familiarity with LLM application patterns, including:
    • Tool calling
    • Retrieval‑augmented generation (RAG)
    • Memory and context management
    • Multi‑step workflows and orchestration
    • Human‑in‑the‑loop systems
  • Experience designing and implementing evaluation frameworks for LLM or agent quality.
  • Familiarity with vector databases, model serving, prompt/version management, and experimentation tooling.
  • Solid knowledge of Data Science principles and their application in NLP.
  • Experience running services in Kubernetes‑based environments.
  • Ability to work across ambiguity, make strong technical tradeoffs, and drive projects from concept to production.
  • Strong communication and collaboration…
Position Requirements
10+ Years work experience