## Software Developer (Agentic Evaluation)
Locations: Toronto, ON, CAN; Vancouver, BC, CAN; Montreal, QC, CAN
Time type: Full time
Posted on: Posted Yesterday
Job requisition: 26WD96920
**Position Overview**

As a Software Developer on the Fusion platform services team within Product Development and Manufacturing Solutions (PDMS), you'll be part of a team of technologists dedicated to creating cutting-edge AI and generative AI solutions that enhance developer productivity and experience. You'll work closely with AI engineers, software architects, and product engineering teams to build and rigorously evaluate intelligent agentic systems, including benchmarking AI agents against commercial solvers, and to develop MCP (Model Context Protocol)-based tooling that integrates seamlessly with IDEs such as VS Code and Cursor.
**Responsibilities**

* Develop and orchestrate multi-agent AI systems for automated test generation, test execution, and end-to-end development workflow optimization using frameworks like LangGraph, AutoGen, or the Anthropic Agent SDK (Claude Code)
* Design and implement agentic workflows that coordinate multiple AI agents to autonomously drive test automation across UI, API, integration, and system levels, from test case synthesis to result evaluation, ensuring seamless integration with existing developer tools and MCP-compatible services
* Build evaluation frameworks and custom benchmarks for agentic systems, including comparisons of AI agents against commercial solvers, using tools like AgentBench and Langfuse
* Evaluate MCP server and tool performance across agentic pipelines, measuring latency, accuracy, context fidelity, and end-to-end task completion rates
**Minimum Qualifications**

* BS/MS in Computer Science, Machine Learning, or a related applied AI field
* Expertise in Python and ML frameworks (PyTorch, Transformers, scikit-learn)
* Experience with Large Language Models applied to software understanding or test generation
* Knowledge of AI evaluation methodologies and metrics for agentic task completion and test quality
* Strong foundation in statistical analysis and experimental design
* Experience with developer workflow and productivity measurement frameworks
**Preferred Qualifications**

* Background in software engineering or QA with close collaboration with development teams
* Familiarity with test automation frameworks (e.g., Playwright, Selenium, Pytest, Appium) and CI/CD pipelines
* Experience designing benchmarks that compare AI agents against commercial or domain-specific solvers
* Hands-on experience with MCP (Model Context Protocol), including building, evaluating, and optimizing MCP servers and tool integrations within agentic pipelines
* Experience with agentic AI frameworks including LangGraph, AutoGen, or the Anthropic Agent SDK / Claude Code
* Knowledge of vision-language models or multi-modal AI for UI and system-level understanding and evaluation
* Experience with Azure AI Foundry/ML or AWS cloud ML platforms