Agentic AI & LLM Software Development Engineer, Senior Job, Melbourne area, Florida USA, Software Development

Agentic AI & LLM Applications Software Development Engineer, Senior

The Opportunity:

To achieve an organization's mission, leaders need strong team members who can build the next generation of agentic AI to transform how clients accelerate research, make decisions, and ship products. That is why we need you, an experienced Software Development Engineer who can operate at a system-of-systems level to support clients in advancing AI-enabled systems within an R&D environment.

As part of our team, you'll serve as a Software Development Engineer to the Advanced Research Projects Agency for Health (ARPA-H). ARPA-H has a small team that is building the next generation of agentic AI to transform how the agency accelerates research, makes decisions, and ships products. The team will evolve ARPA-H's production AI assistant into an ecosystem of autonomous, multi-agent systems.

You'll serve as a Software Development Engineer at the application layer to design and build agentic workflows, build LLM integrations, support tool-calling systems, and develop AI-powered features that users interact with every day. Your focus will be on what runs on top of the platform: the agents, the orchestration, the prompts, the pipelines, and the product. Your attention to detail, flexibility, communication skills, understanding of the client's mission, and problem-solving will enable the mission's success.

What You'll Work On

* Support agentic AI systems and orchestration, LLM application development, features and products, observability and reliability, and engineering excellence

* Design and build core agentic workflows: multi-step reasoning, planning, memory, and tool-use across single and multi-agent systems

* Implement and evolve A2A communication patterns at the application layer, enabling agents to collaborate and hand off tasks, and build and maintain the tool-calling layer, including tool definitions, input and output schemas, error handling, retry logic, and result formatting

* Own the MCP client-side integration, including how agents discover, invoke, and compose tools exposed via MCP servers

* Design multi-agent workflows that are reliable, observable, and debuggable in production, not just in demos

* Own LLM orchestration at the application layer, including prompt construction, context management, model selection logic, and response parsing

* Build and maintain RAG features, including query formulation, result ranking, citation grounding, and hallucination mitigation; implement and iterate on prompt engineering patterns and system prompts that drive GRACE's quality and consistency across OpenAI GPT, Anthropic Claude, and Google Gemini

* Manage context window budgets and know when to truncate, summarize, or paginate, and build the logic that makes those decisions correctly

* Build evaluation pipelines for LLM quality, including grounding assessment, regression testing, safety checks, and A/B experimentation on prompt and model changes

* Stay sharp on token economics and write prompts and pipelines that are cost-efficient without sacrificing output quality

* Translate ambiguous product requirements into clear technical designs and ship them fast; build new product capabilities end-to-end, from backend application logic through to the API contract the frontend consumes; and rapidly prototype new agentic features, run experiments, collect data, and iterate based on real user behavior

* Collaborate closely with product, UX, applied science, and operations; write tests, handle edge cases, and make sure features degrade gracefully when upstream dependencies fail

* Instrument agentic workflows with tracing, logging, and metrics so failures are diagnosable and regressions are caught before users report them

* Define and monitor application-level SLOs: tool call success rates, response quality, and latency from the user's perspective; build fallback and guardrail logic for AI services, including what happens when a model returns something unsafe, off-topic, or structurally wrong; and work closely with the infra engineer to understand system-level constraints and design application behavior that respects them

* Write production-quality code: readable, tested, reviewed, and documented

* Communicate technical decisions clearly to both engineers and non-engineers, so that no one has to guess what you decided or why; participate actively in design reviews; and push back when something is over-engineered or under-specified

* Ensure strong privacy, security, and compliance in all application logic and data handling

Join us. The world can't wait.

You have:

* 7+ years of experience with software engineering, including building and operating production systems

* Experience in high-velocity environments where you owned and shipped complex products end-to-end

* Experience with at least 2 backend languages, including Python

* Experience building and operating systems on major cloud platforms, such as AWS, GCP, or Azure

* Experience with containerization and working…
