Wealth - Lead - GenAI Testing and Evaluation Framework - Vice President
Listed on 2026-01-03
Software Development
AI Engineer, Machine Learning/ML Engineer
We are seeking an innovative and detail-oriented professional to lead the development and management of the Generative AI (GenAI) testing and evaluation framework. This role focuses on creating patterns, methodologies, and iterative structures to optimize the performance and effectiveness of GenAI models, with a particular emphasis on prompt engineering and evaluation. The ideal candidate will have a strong background in GenAI, a deep understanding of natural language processing, and a passion for refining AI solutions through rigorous testing and iteration.
Key Responsibilities
Framework Development
Design and implement a comprehensive testing and evaluation framework for GenAI model outputs.
Develop standards and patterns for assessing the quality and "goodness" of prompts across diverse use cases.
Create iterative processes for testing and refining prompts to optimize model outputs.
Establish criteria for evaluating prompt performance, including accuracy, completeness, relevance, coherence, and alignment with desired outcomes.
Experiment with prompt structures to identify optimal configurations for various business applications.
Develop and document best practices for prompt design and refinement.
Work closely with tech partners, engineers, and product teams to ensure testing frameworks integrate seamlessly into the development lifecycle.
Partner with stakeholders to understand business requirements and tailor testing methodologies to address specific needs.
Provide actionable insights and recommendations to improve model performance based on evaluation results.
Identify and implement tools for automating the testing and evaluation process.
Develop dashboards and reporting mechanisms to monitor prompt and model performance metrics.
Stay updated on emerging tools and techniques in AI testing and integrate them into the framework.
Establish feedback loops to iteratively improve testing methodologies and evaluation standards.
Establish a process for ongoing monitoring of prompts once they are in production.
Monitor industry trends and advancements in Generative AI to ensure the framework remains cutting‑edge.
Advocate for a culture of experimentation and continuous learning within the organization.
Expertise in Generative AI and natural language processing (NLP) models.
Strong proficiency in prompt engineering and familiarity with frameworks for AI evaluation.
Hands‑on experience with AI tools, libraries, and cloud platforms.
Strong problem‑solving skills and ability to derive actionable insights from complex data.
Attention to detail with a focus on precision and accuracy in evaluation.
Deep understanding of AI/ML testing methodologies and best practices.
Proficiency in programming languages like Python and experience with relevant libraries (e.g., PyTorch, TensorFlow).
Passion for exploring new methodologies to improve AI evaluation frameworks.
Creativity in designing experiments and testing approaches.
Excellent communication skills to convey technical concepts to diverse audiences.
Ability to work collaboratively across cross‑functional teams and influence stakeholders.
Comfortable working in a fast‑paced, dynamic environment.
Willingness to learn and adapt to new tools, technologies, and methodologies.
Lead the development of a transformative AI testing framework in a forward-thinking organization.
Work with cutting‑edge technologies and a team of passionate innovators.
Contribute to impactful projects that shape the future of Generative AI.
Enjoy competitive compensation and benefits with opportunities for professional growth.
If you are driven to refine and optimize AI solutions through innovative testing frameworks and have the expertise to lead this effort, we encourage you to apply.
Compensation and Benefits
Primary Location Full Time Salary Range: $ - $. In addition to salary, Citi offers competitive employee benefits, including medical, dental & vision…