
Core Engineering - ETO - Machine Learning Engineer - Associate/VP - Bengaluru · India

Job in New York, New York County, New York, 10261, USA
Listing for: Goldman Sachs Bank AG
Full Time position
Listed on 2025-12-02
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ ML Engineer
Location: New York

What We Do

At Goldman Sachs, our Engineers don’t just make things – we make things possible. Change the world by connecting people and capital with ideas. Solve the most challenging and pressing engineering problems for our clients. Join our engineering teams that build massively scalable software and systems, architect low latency infrastructure solutions, proactively guard against cyber threats, and leverage machine learning alongside financial engineering to continuously turn data into action.

Create new businesses, transform finance, and explore a world of opportunity at the speed of markets.

Who We Look For

Goldman Sachs Engineers are innovators and problem-solvers, building solutions in risk management, big data, mobile and more. We look for creative collaborators who evolve, adapt to change and thrive in a fast-paced global environment.

Business Unit Overview

Enterprise Technology Operations (ETO) is a Business Unit within Core Engineering focused on running scalable production management services with a mandate of operational excellence and operational risk reduction achieved through large scale automation, best-in-class engineering, and application of data science and machine learning. The Production Runtime Experience (PRX) team in ETO applies software engineering and machine learning to production management services, processes, and activities to streamline monitoring, alerting, automation, and workflows.

Team Overview

The Machine Learning and Artificial Intelligence team in PRX applies advanced ML and GenAI to reduce the risk and cost of operating the firm’s large-scale compute infrastructure and extensive application estate. Building on strengths in statistical modelling, anomaly detection, predictive modelling, and time-series forecasting, we leverage foundation LLMs to orchestrate multi-agent systems for automated production management services. By unifying classical ML with agentic AI, we deliver reliable, explainable, and cost-efficient operations at scale.

Role and Responsibilities

In this role, you will be responsible for launching and implementing GenAI agentic solutions aimed at reducing the risk and cost of managing large-scale production environments with varying complexities. You will address various production runtime challenges by developing agentic AI solutions that can diagnose, reason, and take actions in production environments to improve productivity and address issues related to production support.

What you’ll do
  • Build agentic AI systems:
    Design and implement tool-calling agents that combine retrieval, structured reasoning, and secure action execution (function calling, change orchestration, policy enforcement) following the Model Context Protocol (MCP). Engineer robust guardrails for safety, compliance, and least-privilege access.
  • Productionize LLMs:
    Build an evaluation framework for open-source and foundation LLMs; implement retrieval pipelines, prompt synthesis, response validation, and self-correction loops tailored to production operations.
  • Integrate with runtime ecosystems:
    Connect agents to observability, incident management, and deployment systems to enable automated diagnostics, runbook execution, remediation, and post-incident summarization with full traceability.
  • Collaborate directly with users:
    Partner with production engineers and application teams to translate production pain points into agentic AI roadmaps; define objective functions linked to reliability, risk reduction, and cost; and deliver auditable, business-aligned outcomes.
  • Safety, reliability, and governance:
    Build validator models, adversarial prompts, and policy checks into the stack; enforce deterministic fallbacks, circuit breakers, and rollback strategies; instrument continuous evaluations for usefulness, correctness, and risk.
  • Scale and performance:
    Optimize cost and latency via prompt engineering, context management, caching, model routing, and distillation; leverage batching, streaming, and parallel tool-calls to meet stringent SLOs under real-world load.
  • Build a RAG pipeline:
    Curate domain knowledge; build a data-quality validation framework; establish feedback loops and milestone framework maintain…
Position Requirements
10+ years of work experience