Generative AI Engineer Job, Hyderabad area, Telangana, India, IT/Tech

Senior AI/ML Python Engineer
Exp: 6-14 Years

Location: Hyderabad [Madhapur]
Work mode: 2 Days WFO
NP: Immediate/15 Days

Mandatory Skills:

AI/ML, LLM, Core AI, Azure, MCP, RAG, Docker/Kubernetes

Responsibilities:

• Drive the end-to-end design, development, and deployment of enterprise-grade AI solutions leveraging Azure AI, Google Vertex AI, or comparable cloud platforms.

• Architect and implement advanced AI systems, including agentic workflows, LLM integrations, MCP-based solutions, RAG pipelines, and scalable microservices.

• Oversee the development of Python-based applications, RESTful APIs, data processing pipelines, and complex system integrations.

• Define and uphold engineering best practices, including CI/CD automation, testing frameworks, model evaluation procedures, observability, and operational monitoring.

• Partner closely with product owners and business stakeholders to translate requirements into actionable technical designs, delivery plans, and execution roadmaps.

• Provide hands-on technical leadership, conducting code reviews, offering architectural guidance, and ensuring adherence to security, governance, and compliance standards.

• Communicate technical decisions, delivery risks, and mitigation strategies effectively to senior leadership and cross-functional teams.

Required Skills & Experience:

LLM & Core AI

• Strong understanding of transformers (attention, tokens, context window) and LLM behavior.

• Hands-on experience with two or more LLM providers (e.g., Azure OpenAI, Anthropic, or open-source models such as Llama or Qwen).

• Experience tuning decoding parameters and handling context window limits (truncation, sliding window, summarization).

Prompting & Context Engineering

• Proven experience designing multi-layer prompts (system/policy, task, user, tools, retrieved context).

• Built context builders that select relevant history (recency + semantic) and inject tool + RAG outputs.

• Implemented context compression (conversation/memory summarization) and structured outputs (JSON/schema) with robust error handling.

Tools, MCP & External Integrations

• Designed and implemented LLM tools/function schemas with validation, clear errors, and safe side-effects.

• Hands-on experience with MCP (Model Context Protocol): building MCP servers/tools for internal data and actions, including auth and multi-tenant isolation.

• Experience integrating REST/SQL/sandboxed execution tools and defining fallback/degradation strategies when tools fail.

Agentic Systems, Orchestration & A2A

• Built multi-step agentic workflows: plan → tool calls → intermediate decisions → final answer.

• Practical use of agent roles (Planner / Worker / Critic / Router / Supervisor).

• Hands-on with A2A (Agent-to-Agent) collaboration where specialist agents exchange structured state.

• Experience with at least one agentic/workflow framework (e.g., LangGraph, LangChain agents, Google ADK, Orkes Conductor, Temporal) and with checkpointed, resumable flows (Postgres/Redis).

RAG & Knowledge Orchestration

• Delivered end-to-end RAG systems: ingestion → chunking → embedding → indexing → retrieval → synthesis.

• Implemented hybrid search (vector + keyword + filters) over enterprise sources (PDF, HTML, Confluence/SharePoint, SQL).

• Experience with query rewriting/expansion and grounded answers with citations, including debugging retrieval quality.

Reasoning, Evaluation & Guardrails

• Implemented ReAct-style and tool-augmented reasoning patterns, including self-critique/second-pass flows.

• Defined task-level success metrics and built golden test flows from real logs to evaluate prompt/model/flow changes.

• Instrumented telemetry for tool errors, step counts, loops, latency, and cost (tokens, per feature/tenant).

• Implemented guardrails (prompt-injection defenses, per-tenant/per-role tool and data access, input/output filtering, PII-safe logging) and participated in red teaming/adversarial testing.

Model, Cost & Performance Engineering

• Experience choosing and combining small router/classifier models with large reasoning models.

• Implemented caching (LLM outputs, retrieval results) and optimized latency (parallelization, step count, time budgets).

• Built or contributed to cost/usage monitoring for LLM and agent…

