Keysight Technologies Sales Spain SL.
Senior Engineer – Agentic Runtime Safety, Stability & Observability
Keysight Technologies Sales Spain SL., Calabasas, California, United States, 91302
Overview
Keysight is at the forefront of technology innovation, delivering breakthroughs and trusted insights in electronic design, simulation, prototyping, test, manufacturing, and optimization. Our ~15,000 employees create world‑class solutions in communications, 5G, automotive, energy, quantum, aerospace, defense, and semiconductor markets for customers in over 100 countries.
Our award‑winning culture embraces a bold vision of where technology can take us and a passion for tackling challenging problems with industry‑first solutions. We believe that when people feel a sense of belonging, they can be more creative, innovative, and thrive at all points in their careers.
About the Initiative
Keysight’s Applied AI Autonomy Initiative is developing a next‑generation agentic orchestration framework that enables AI agents to reason, adapt, and coordinate across complex engineering workflows. Built on LangGraph and reinforcement‑inspired feedback mechanisms, this framework transforms prompts and design intents into executable orchestration strategies that evolve autonomously through iterative simulation and validation loops.
Our ambition is not merely to replicate human reasoning, but to push past human limits – enabling agentic systems to explore design spaces, optimize engineering workflows, and evolve orchestration strategies at a scale and speed no human could achieve.
This role defines the safety, stability, and observability architecture underpinning Keysight’s agentic runtime – the layer that ensures AI‑driven orchestration remains interpretable, reversible, and aligned with human intent. You will design the mechanisms that make autonomy trustworthy: guardrails, rollback systems, introspection APIs, and adaptive feedback loops governing every agentic decision and simulator interaction.
Responsibilities
Role Overview
As the Senior Agentic Runtime Safety & Stability Engineer, you will own the resilience and transparency backbone of Keysight’s multi‑agent orchestration stack, ensuring that:
Every AI‑driven orchestration step is safe, auditable, and predictable.
The system can detect, explain, and recover from unsafe or emergent behaviors.
Human intent is faithfully interpreted and securely executed.
Closed‑loop interactions between LLM‑based agents, reinforcement learning systems, and EDA simulators are continuously monitored and governed.
This position bridges AI reasoning, runtime systems engineering, and control safety – creating a foundation where autonomous orchestration is both powerful and predictable.
Runtime Guardrails, Intent Safety & Execution Control
Architect runtime guardrails and authorization layers ensuring agent actions remain aligned with operator intent, policy boundaries, and simulation constraints.
Implement intent validation, semantic disambiguation, and prompt safety checks before orchestration execution.
Define structured safety contracts governing valid operating ranges, escalation paths, and rollback logic.
Integrate safety constructs into orchestration semantics and graph‑based reasoning flows with the Agentic Framework Architect.
Fault Isolation, Rollback & Recovery Engineering
Design deterministic rollback and checkpointing mechanisms to restore stable orchestration states after failure and enable automatic recovery paths for misaligned or unsafe agent behavior.
Engineer fault‑isolation boundaries to contain local agent or simulator errors and prevent systemic instability.
Build sandboxed execution environments for validating AI‑generated orchestration logic safely.
Develop interoperability safety layers between Python and RL technologies to ensure reliable data exchange and robust error containment in simulation‑driven loops.
Telemetry, Observability & Introspective Diagnostics
Implement comprehensive observability pipelines capturing agent reasoning traces, simulation telemetry, and orchestration health metrics.
Create real‑time anomaly detection and confidence‑scored safety gating to monitor drift, misalignment, or policy violations.
Develop introspection APIs and dashboards exposing safety metrics, decision rationales, and performance diagnostics.
Collaborate with DevOps and Data Intelligence teams to unify telemetry across heterogeneous runtime components into a coherent monitoring fabric.
Adaptive Governance & Continuous Safety Learning
Establish adaptive feedback systems that adjust orchestration parameters based on observed performance, safety signals, and environmental dynamics.
Define self‑correcting safety policies enabling agents to learn from past instability and improve compliance autonomously.
Integrate safety scoring into promotion gates and validation workflows for runtime certification of agentic logic.
Partner with ML and validation engineers to evolve a continuous assurance pipeline that evaluates trust, stability, and interpretability over time.
Key Responsibilities
Architect and own the safety, observability, and governance layer of Keysight’s agentic orchestration runtime.
Design real‑time self‑healing and self‑correcting mechanisms that detect misalignment, autonomously mitigate instability, and restore safe operational behavior without degrading user experience.
Build deterministic rollback, checkpointing, and containment systems for multi‑agent and simulation‑based environments.
Implement multi‑layered telemetry, anomaly detection, and runtime introspection pipelines.
Integrate observability across LLM, RL, and simulation environments into a unified safety and diagnostics interface.
Collaborate cross‑functionally to embed transparency, traceability, and adaptive safety into every orchestration cycle.
Qualifications
Required Qualifications
PhD or 5+ years of experience in systems reliability, safety‑critical software, or autonomous runtime engineering.
Advanced proficiency in Python and C/C++, with experience in hybrid or simulation‑based systems.
Proven expertise designing fault‑tolerant, observable, and recoverable distributed systems.
Deep proficiency with agentic orchestration frameworks (LangGraph, LangChain, or equivalents).
Strong understanding of intent alignment, policy enforcement, and execution traceability in AI automation.
Hands‑on experience implementing telemetry, monitoring, and introspection systems in complex runtime architectures.
Preferred Qualifications
Background in mission‑critical or regulated runtime systems (e.g., aerospace, industrial control, EDA, or HPC).
Experience designing semantic safety validation, policy modeling, and goal disambiguation frameworks.
Familiarity with adaptive rollback, dynamic gating, and safety scoring in multi‑agent environments.
Proficiency with Python/C++ interoperability (PyBind11, gRPC, ZeroMQ).
Understanding of deterministic simulation control and real‑time anomaly detection in hybrid AI–physics systems.
What This Role Offers
A foundational and high‑impact role defining the safety, stability, and observability backbone of Keysight’s next‑generation agentic orchestration systems.
The opportunity to engineer mission‑grade guardrails, rollback logic, and transparency mechanisms that ensure autonomous multi‑agent workflows remain predictable, interpretable, and aligned with human intent.
Direct influence on how AI agents reason, act, recover, and self‑correct within high‑assurance engineering environments – embedding trust, traceability, and adaptive safety into every orchestration cycle.
A leadership position at the intersection of runtime engineering, intelligent systems, and safety‑critical autonomy – helping shape how Keysight deploys and governs the next era of agentic intelligence.
The level of role will be based on experience, education and skills; most offers will be between the minimum and the midpoint of the Salary Range listed below.
CA pay range: Min $156,740 – Max $261,230.
Note: For other locations, pay ranges will vary by region.
This role is eligible for our Keysight Results Bonus Program.
Benefits
Medical, dental and vision
Health Savings Account
Health Care and Dependent Care Flexible Spending Accounts
Life, Accident, Disability insurance
Business Travel Accident and Business Travel Health
401(k) Plan
Flexible Time Off, Paid Holidays
Paid Family Leave
Discounts, Perks
Tuition Reimbursement
Adoption Assistance
ESPP (Employee Stock Purchase Plan)
Restricted Stock Units
Equal Opportunity
Keysight is an Equal Opportunity Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, protected veteran status, disability or any other protected categories under all applicable laws.