Senior Principal Engineer, Agent Harness
Listed on 2026-06-04
Software Development
AI Engineer, Cloud Engineer - Software
We are Progress (Nasdaq: PRGS) - the trusted provider of software that enables our customers to develop, deploy and manage responsible, AI‑powered applications and experiences with agility and ease. We’re proud to have a diverse, global team where we value the individual and enrich our culture by considering varied perspectives because we believe people power progress. Join us as a Senior Principal Engineer, Agent Harness and help us do what we do best: propelling business forward.
AI is rapidly transforming the world. As generative AI reshapes industries, teams need strong ways to monitor, troubleshoot, and continuously improve their AI systems. We’re building several products in the AI Engineering space, including an AI observability and evaluation platform and Agentic Memory, that empower AI engineers to ship high‑performing, reliable agents and LLM‑powered applications.
In this role, you will:
- Lead the architecture, design and implementation of backend and platform services for AI‑powered and agentic systems.
- Design and build agentic harness tooling that enables teams to develop, evaluate, debug, deploy and operate LLM‑powered agents and workflows.
- Apply deep knowledge of AI, LLMs, model behavior, prompt/version management, RAG, tool use, MCP, memory systems, evaluation patterns and agent orchestration to real product challenges.
- Build reusable AI platform capabilities that can be adopted across several Progress products, including Agentic Memory, Observability, evaluation pipelines, developer tooling and other AI‑enabled experiences.
- Own complex technical initiatives end‑to‑end—from initial discovery and architecture to development, testing, production rollout and operational maturity.
- Write secure, testable and maintainable code while setting a high engineering bar across backend services, infrastructure and platform components.
- Solve complex engineering challenges and help shape scalable, high‑availability architectures that support production‑grade AI workloads and long‑term product growth.
- Drive DevOps excellence through infrastructure automation, CI/CD, deployment reliability, observability, incident readiness and cost‑aware cloud architecture.
- Establish strong observability and evaluation practices for AI systems, including tracing, metrics, structured logging, model and agent behavior analysis, feedback loops and reliability signals.
- Collaborate with product, engineering, architecture, security and DevOps teams to translate ambiguous AI platform needs into clear technical direction and deliverable systems.
- Lead technical discovery and prototyping in fast‑moving AI areas, then turn validated ideas into robust, supportable platform capabilities.
- Provide technical leadership across teams by mentoring engineers, guiding architectural decisions, reviewing designs and raising engineering standards.
- Stay current with the evolving AI, LLM, agentic systems and observability ecosystem, and use that knowledge to influence Progress’ platform direction.
- Operate within an established agile process that supports continuous delivery and enables rapid iterations.
To succeed in this role, you will need:
- Extensive experience in software development, including strong hands‑on experience building and operating distributed backend systems in production.
- Strong experience with .NET services.
- Practical knowledge of other languages including Go, Python and TypeScript.
- Expertise in containerization and orchestration, including Docker and Kubernetes or equivalent platforms.
- Experience in the backend development of database‑driven applications and SaaS products.
- Strong experience with cloud architecture and services, preferably AWS—storage, orchestration, networking, load balancing, security and cost optimization.
- Knowledge of Infrastructure as Code, preferably Terraform.
- Strong DevOps and platform engineering experience, including CI/CD, deployment automation, infrastructure automation and production operations.
- Experience designing systems that are observable, debuggable, secure, resilient and maintainable under production load.
- Ability to lead across teams, bring clarity to ambiguous technical problems and align stakeholders around practical solutions.