Senior Engineer - Agentic Runtime Safety, Stability & Observability
Listed on 2025-12-27
IT/Tech
Systems Engineer, AI Engineer
Overview
Keysight is at the forefront of technology innovation, delivering breakthroughs and trusted insights in electronic design, simulation, prototyping, test, manufacturing, and optimization. Our ~15,000 employees create world-class solutions in communications, 5G, automotive, energy, quantum, aerospace, defense, and semiconductor markets for customers in over 100 countries.
Our award‑winning culture embraces a bold vision of where technology can take us and a passion for tackling challenging problems with industry‑first solutions. We believe that when people feel a sense of belonging, they can be more creative and innovative, and can thrive at all points in their careers.
About the Initiative
Keysight's Applied AI Autonomy Initiative is developing a next‑generation agentic orchestration framework that enables AI agents to reason, adapt, and coordinate across complex engineering workflows. Built on LangGraph and reinforcement‑inspired feedback mechanisms, this framework transforms prompts and design intents into executable orchestration strategies that evolve autonomously through iterative simulation and validation loops.
Our ambition is not merely to replicate human reasoning, but to push past human limits - enabling agentic systems to explore design spaces, optimize engineering workflows, and evolve orchestration strategies at a scale and speed no human could achieve.
This role defines the safety, stability, and observability architecture underpinning Keysight's agentic runtime – the layer that ensures AI‑driven orchestration remains interpretable, reversible, and aligned with human intent. You will design the mechanisms that make autonomy trustworthy: guardrails, rollback systems, introspection APIs, and adaptive feedback loops governing every agentic decision and simulator interaction.
Role Overview
As the Senior Agentic Runtime Safety & Stability Engineer, you will own the resilience and transparency backbone of Keysight's multi‑agent orchestration stack.
You will architect the runtime contracts, monitoring systems, and adaptive control mechanisms that ensure:
- Every AI‑driven orchestration step is safe, auditable, and predictable
- The system can detect, explain, and recover from unsafe or emergent behaviours
- Human intent is faithfully interpreted and securely executed
- Closed‑loop interactions between LLM‑based agents, reinforcement learning systems, and EDA simulators are continuously monitored and governed
This position bridges AI reasoning, runtime systems engineering, and control safety – creating a foundation where autonomous orchestration is both powerful and predictable.
Core Responsibility Domains
- Architect runtime guardrails and authorization layers ensuring that agent actions remain aligned with operator intent, policy boundaries, and simulation constraints.
- Implement intent validation, semantic disambiguation, and prompt safety checks before orchestration execution.
- Define structured safety contracts governing valid operating ranges, escalation paths, and rollback logic.
- Integrate safety constructs into orchestration semantics and graph‑based reasoning flows with the Agentic Framework Architect.
- Design deterministic rollback and checkpointing mechanisms to restore stable orchestration states after failure and enable automatic recovery paths for misaligned or unsafe agent behaviour.
- Engineer fault‑isolation boundaries to contain local agent or simulator errors and prevent systemic instability.
- Build sandboxed execution environments for validating AI‑generated orchestration logic safely.
- Develop interoperability safety layers between Python and RL technologies to ensure reliable data exchange and robust error containment in simulation‑driven loops.
- Implement comprehensive observability pipelines capturing agent reasoning traces, simulation telemetry, and orchestration health metrics.
- Create real‑time anomaly detection and confidence‑scored safety gating to monitor drift, misalignment, or policy violations.
- De…