Search - Search Inference - Senior MLOps Engineer
Listed on 2025-12-25
IT/Tech
AI Engineer, Machine Learning/ ML Engineer, Data Engineer
Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale – unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter.
By taking advantage of all structured and unstructured data – securing and protecting private information more effectively – Elastic’s complete, cloud‑based solutions for search, security, and observability help organizations deliver on the promise of AI.
The Role
The Search Inference team is responsible for bringing performant, ergonomic, and cost‑effective machine learning (ML) model inference to Search workflows. ML inference has become a crucial part of the modern search experience, whether used for query understanding, semantic search, RAG, or any other GenAI use‑case.
Our goal is to simplify ML inference in Search workflows by focusing on large‑scale inference capabilities for embedding and reranking models that are available across the Elasticsearch user base. As a team, we are a collaborative, cross‑functional group with backgrounds in information retrieval, natural language processing, and distributed systems, working with Go microservices, Python, Ray Serve, and Kubernetes/KubeRay on AWS, GCP, and Azure.
We provide thought leadership across a variety of mediums, including open code repositories, blog posts, and speaking engagements. We focus on meeting our customers' expectations for throughput, latency, and cost. We're seeking an experienced MLOps Engineer to help us deliver on this vision.
What You Will Be Doing
- Work with the team (and other teams) to evolve our inference service so it can host LLMs in addition to existing models (ELSER, E5, Rerank).
- Enhance the scalability and reliability of the service and work with the team to ensure knowledge is shared and best practices are followed.
- Improve the cost and efficiency of the platform, making the best use of available infrastructure.
- Adapt existing solutions to use our inference service, ensuring a seamless transition.
What You Bring
- 5+ years working in an MLOps or related ML engineering role.
- Production experience self‑hosting & operating models at scale via an inference framework such as Ray or KServe (or similar).
- Production experience with running and tuning specialized hardware, especially GPUs via CUDA.
- Nice-to-have: production experience self-hosting inference for LLMs.
- Measured and articulate written and spoken communication skills. You work well with others and can craft concise, expressive thoughts into correspondence: emails, issues, investigations, documentation, onboarding materials, and so on.
- An interest in learning new tools, workflows, and philosophies that can help you grow. You function well in an environment that drives toward change. This role has tremendous opportunities for growth!
Benefits
- Competitive pay based on the work you do here, not your previous salary.
- Health coverage for you and your family in many locations.
- Ability to craft your calendar with flexible locations and schedules for many roles.
- Generous number of vacation days each year.
- We match up to $2000 (or local currency equivalent) for financial donations and service.
- Up to 40 hours each year to use toward volunteer projects you love.
- Embracing parenthood with a minimum of 16 weeks of parental leave.
Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. Qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, disability status, or any other basis protected by federal, state or local law, ordinance or regulation.
We welcome individuals with disabilities and…