AI Engineer

Job in New York City, Richmond County, New York, USA
Listing for: Inizio Partners
Full Time position
Listed on 2026-06-01
Job specializations:
  • Software Development
    AI Engineer (Applied/Software), Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 160,000 USD yearly
Job Description & How to Apply Below
About the job AI Engineer

Job Title: AI Engineer, Agentic & RAG Systems

Location: Remote

Department: AI & Data Platforms

Compensation: Up to $160k Base + 12% bonus

About the Role

As an AI Engineer, you will design, build, and operate agentic AI systems end-to-end, from concept to production. You'll work on multi-agent orchestration, Retrieval-Augmented Generation (RAG), evaluation frameworks, and AI guardrails to build safe, reliable, and high-performing systems.

You will collaborate cross-functionally with product, ML, and design teams, bringing ideas to life through strong engineering execution, clear communication, and a low-ego, problem-solving mindset.

Key Responsibilities

1. RAG Development & Optimization
  • Design and implement Retrieval-Augmented Generation pipelines to ground LLMs in enterprise or domain-specific data.
  • Make strategic decisions on chunking strategy, embedding models, and retrieval mechanisms to balance context precision, recall, and latency.
  • Work with vector databases (Qdrant, Weaviate, pgvector, Pinecone) and embedding frameworks (OpenAI, Hugging Face, Instructor, etc.).
  • Diagnose and iterate on challenges like chunk size trade-offs, retrieval quality, context window limits, and grounding accuracy using structured evaluation and metrics.
2. Chatbot Quality & Evaluation Frameworks
  • Establish comprehensive evaluation frameworks for LLM applications, combining quantitative metrics (BLEU, ROUGE, response time) with qualitative methods (human evaluation, LLM-as-a-judge, relevance, coherence, user satisfaction).
  • Implement continuous monitoring and automated regression testing using tools like LangSmith, Langfuse, Arize, or custom evaluation harnesses.
  • Identify and prevent quality degradation, hallucinations, or factual inconsistencies before production release.
  • Collaborate with design and product to define success metrics and user feedback loops for ongoing improvement.
3. Guardrails, Safety & Responsible AI
  • Implement multi-layered guardrails across input validation, output filtering, prompt engineering, re-ranking, and abstention ("I don't know") strategies.
  • Use frameworks such as Guardrails AI, NeMo Guardrails, or Llama Guard to ensure compliance, safety, and brand integrity.
  • Build policy-driven safety systems for handling sensitive data, user content, and edge cases with clear escalation paths.
  • Balance safety, user experience, and helpfulness, knowing when to block, rephrase, or gracefully decline responses.
4. Multi-Agent Systems & Orchestration
  • Design and operate multi-agent workflows using orchestration frameworks such as LangGraph, AutoGen, CrewAI, or Haystack.
  • Coordinate routing logic, task delegation, and parallel vs. sequential agent execution to handle complex reasoning or multi-step tasks.
  • Build observability and debugging tools for tracking agent interactions, performance, and cost optimization.
  • Evaluate trade-offs around latency, reliability, and scalability in production-grade multi-agent environments.
Minimum Qualifications
  • Strong proficiency in Python (FastAPI, Flask, asyncio); GCP experience is good to have.
  • Demonstrated hands-on RAG implementation experience with specific tools, models, and evaluation metrics.
  • Practical knowledge of agentic frameworks (LangGraph, LangChain) and evaluation ecosystems (Langfuse, LangSmith).
  • Excellent communication skills, proven ability to collaborate cross-functionally, and a low-ego, ownership-driven work style.
Preferred / Good-to-Have Qualifications
  • Experience in traditional AI/ML workflows, e.g., model training, feature engineering, and deployment of ML models (scikit-learn, TensorFlow, PyTorch).
  • Familiarity with retrieval optimization, prompt tuning, and tool-use evaluation.
  • Background in observability and performance profiling for large-scale AI systems.
  • Understanding of security and privacy principles for AI systems (PII redaction, authentication/authorization, RBAC).
  • Exposure to enterprise chatbot systems, LLMOps pipelines, and continuous model evaluation in production.