Lead Software Engineer - Java/Python, LLM, AWS Job, Bangalore area, Bengaluru, Karnataka, India, IT/Tech

Location: Bengaluru

Job Description

Be part of a team that is pushing the boundaries of what's possible.

As a Lead Software Engineer at JPMorgan Chase within the Commercial & Investment Bank's Risk Central Technology team, you are an integral member of an agile team building secure, stable, and scalable LLM-enabled solutions. As a core technical contributor, you design and deliver controlled, well-understood LLM-assisted components and multi-agent workflows across multiple business functions in support of the firm's objectives in a regulated environment.

Job Responsibilities

Execute creative LLM-assisted software solutions: design, develop, and troubleshoot LLM-powered applications and services (e.g., retrieval-augmented generation, agent workflows, structured extraction, classification), with a willingness to think beyond routine approaches and apply novel agentic AI thinking to break down technical problems and deliver measurable outcomes.
Develop data quality rules and controls using LLMs: define and enforce guardrails for prompts, retrieved context, model inputs/outputs, and post-processing, including PII redaction, toxicity/safety filters, hallucination mitigation, output schema validation, and policy compliance.
Provide Level 3 (L3) support for LLM-assisted production systems: own complex incidents, model and prompt rollouts/rollbacks, and dependency issues (vector stores, embeddings, feature stores), and ensure high availability, reliability, and adherence to SLAs, including latency and cost budgets.
Support BAU operations for Markets businesses: maintain and evolve LLM use cases supporting markets workflows with disciplined change management, canary releases, A/B tests, and close partnership with product, controls, and operations.
Create secure, high-quality production code: implement LLM-assisted microservices, synchronous and asynchronous inference pipelines (streaming where appropriate), deterministic fallbacks, circuit breakers, and observability for reliability in production.
Produce architecture and design artifacts: deliver model cards, system/data lineage, RAG/agent reference architectures, prompt libraries and versioning strategies, evaluation plans, and control evidence, ensuring design constraints and regulatory expectations are met during development.
Identify hidden problems and patterns: use telemetry, error analysis, prompt and context analytics, and drift detection to improve model selection, prompt strategies, retrieval quality, chunking/embedding strategies, and system architecture.
Drive LLM Ops best practices: integrate models, prompts, and evaluation into CI/CD; enforce approvals, segregation of duties, and reproducibility; automate regression and guardrail tests; and manage the lifecycle across environments.
Ensure that model strengths, limitations, and risk profiles are understood, documented, and appropriately applied across different classes of software work. Maintain a deep understanding of the strengths, limitations, and risk characteristics of approved LLMs (e.g., Claude, ChatGPT, and successor models), including safety profiles, context limits, determinism strategies, and fine-tuning vs. prompt-only tradeoffs. Design multi-agent workflows that incorporate LLM-driven analysis, code generation, testing, and review, with explicit human approval gates and segregation of duties.
Ensure LLM-driven systems meet enterprise reliability and resilience expectations, including disaster recovery, fallback behaviors, regional resiliency, and performance SLOs.

Required Qualifications, Capabilities, and Skills

Formal training or certification in software engineering concepts and 5+ years of applied experience.
Strong coding skills in Java/Python and SQL, applied to building LLM-enabled microservices, retrieval pipelines, evaluators, and data tooling; solid understanding of data structures, algorithms, and object-oriented programming as applied to LLM latency, caching, and throughput.
Hands-on experience with AWS and cloud data management (e.g., Redshift, DynamoDB, Aurora, Databricks), plus experience integrating managed model endpoints and embedding/vector services; familiarity with secure secret management, networking, and least-privilege access.
Proficiency in automation, CI/CD, and agile methodologies with LLM Ops extensions: prompt and config versioning, automated evaluations, canary releases, and rollback strategies.
Experience in system design, application development, and operational stability for LLM architectures, including retrieval layers, vector stores, caching, observability, rate limiting, and backpressure strategies.
Strong analytical, problem-solving, and communication skills, including the ability to explain model behaviors, tradeoffs, and control decisions to both technical and non-technical stakeholders.
Ability to provide L3 and BAU support for Markets by leveraging LLMs for incident triage, runbook retrieval, and pre-approved auto-remediation, with on-call coverage for LLM services and dependencies.
Expert-level knowledge…
