Software Engineer; Agentic Systems – AI Software Delivery
Listed on 2026-05-07
Software Development
DevOps, Software Architect, Cloud Engineer - Software, Software Engineer
Job Title: Staff Software Engineer (Agentic Systems) – AI for Software Delivery
Job Type: Full-Time
Location: Remote - Australia
About CloudBees
CloudBees helps enterprises deliver secure, compliant software. Its products sit at the center of the software delivery lifecycle, from Jenkins-based CI to modern cloud-native DevSecOps platforms.
Our focus is simple: make software delivery faster, safer, and more reliable for large engineering organizations.
About the Role
We’re embedding AI across the entire software development lifecycle to reduce the time it takes to understand, triage, and resolve problems - from planning and code, through CI/CD, to production systems.
As a Staff Engineer, you will lead the design and evolution of agentic systems that operate across the SDLC. These systems ingest signals from multiple sources – code, pull requests, tests, pipelines, incidents, and documentation – and reason over them to provide high‑confidence insights and, over time, take safe, constrained actions.
This is not just about adding AI to existing workflows. It’s about defining how intelligent systems participate in software delivery, where context is fragmented, systems are interdependent, and correctness matters.
You will work across team and product boundaries to:
- Define architectures for cross‑SDLC agentic workflows
- Establish patterns for reasoning, context synthesis, and tool orchestration
- Connect signals across traditionally siloed systems (e.g. code, CI, incidents, docs)
- Drive the evolution from advisory systems to controlled, auditable automation
- Ensure these systems are reliable, observable, and safe in production environments
This role requires both deep technical execution and strong technical leadership, including shaping direction, influencing multiple teams, and raising the bar for how AI systems are built and operated.
What You’ll Do
- Lead the design of AI‑driven systems that triage, explain, and eventually remediate CI/CD failures across products
- Define and evolve agent architectures, including reasoning strategies, tool orchestration, and context management
- Drive cross‑team technical initiatives, aligning platform, product, and infrastructure components
- Translate ambiguous problem spaces into well‑defined systems and execution plans
- Establish engineering standards and patterns for:
  - agent reliability and correctness
  - evaluation and benchmarking
  - observability and tracing
  - safety and guardrails
- Improve system performance across:
  - reasoning quality
  - tool‑call accuracy
  - latency and cost efficiency
- Mentor engineers and raise the technical bar across the team
- Partner with product and leadership to shape roadmap and technical strategy
- Contribute to build‑vs‑buy decisions, platform direction, and long‑term architecture
What We’re Looking For
Required
- 8+ years of software engineering experience, including experience operating at a strong senior or staff level
- Proven experience designing and delivering complex, distributed systems in production
- Strong programming skills in Python, Go, or TypeScript
- Deep experience with cloud‑native architectures (AWS or GCP)
- Hands‑on experience building LLM‑based or AI‑assisted systems, including:
  - prompt and context design
  - tool integration and orchestration
  - evaluation frameworks and metrics
  - guardrails and safety mechanisms
- Experience defining and operating systems with:
  - high reliability and availability
  - strong observability (metrics, tracing, logging)
  - clear failure modes and recovery strategies
- Demonstrated ability to:
  - lead technical initiatives across teams
  - operate effectively in ambiguous problem spaces
  - balance speed of iteration with production quality
Preferred
- Experience building agentic systems or multi‑step reasoning workflows
- Familiarity with CI/CD ecosystems (Jenkins, GitHub Actions, Argo, etc.)
- Experience with LLM evaluation frameworks, tracing tools, or RAG systems
- Understanding of trade‑offs between autonomy, safety, and control in AI systems
- Experience influencing architecture or strategy at a product or platform level
What Success Looks Like
In your first 3‑6 months, you will:
- Deliver AI‑assisted workflows that materially reduce time‑to‑triage for software failures
- Establish baseline evaluation frameworks for…