Senior AWS AgentCore Platform Engineer

Job in Exton, Chester County, Pennsylvania, 19341, USA
Listing for: Apolis
Part Time position
Listed on 2026-07-01
Job specializations:
  • Software Development
    Cloud Engineer - Software, DevOps, AI Engineer (Applied/Software), Backend Developer
Position Type:
Contract to hire after initial 6 months

Location:
Reading, PA or Exton, PA (hybrid, 2-3 days a week in the office)

Job Description:

1. Observability & Distributed Tracing
  • Gap Analysis:
    Assess Amazon CloudWatch, X-Ray, Bedrock logging, and AgentCore traces against agentic workflow requirements; produce a comprehensive gap analysis and lead the setup of observability within Dynatrace.
  • Validation Pipelines:
    Design and implement post-deployment validation pipelines for agents and Model Context Protocol (MCP) servers, ensuring deployment health and successful tool registration.
  • Tracing & Logging:
    Implement distributed tracing and structured logging to capture LLM decision logic, tool selections, sub-agent calls, and MCP interactions.
  • Architecture Strategy:
    Evaluate Langfuse and LiteLLM proxies against AWS-native solutions; deliver a target-state observability architecture recommendation.
2. Cost Tracking & TCO (Total Cost of Ownership)
  • Taxonomy Expansion:
    Extend tagging taxonomy to capture costs across agent runtimes, MCP servers, vector databases, and Bedrock token consumption per namespace.
  • Cost Modeling:
    Design a granular cost visibility model to aggregate expenses for agents, MCPs, and LLM tokens by team and department.
  • Dashboards & Alerting:
    Build CloudWatch (or equivalent) dashboards for per-team spending; configure AWS Budgets with proactive alerting thresholds.
  • Automation:
    Automate cost reporting via email and Microsoft Teams, incorporating anomaly detection rules to identify spend spikes.
3. Monitoring & Incident Management
  • Alerting Framework:
    Define and implement P1–P4 alerting rules covering deployment failures, runtime errors, tool invocation failures, and MCP connectivity issues.
  • Incident Integration:
    Integrate alert notifications with Microsoft Teams and email, utilizing resource ownership tags for intelligent routing.
  • Operational Excellence:
    Author detailed runbooks for every alert; publish and maintain these in Confluence to facilitate developer self-service resolution.
  • Stack Evaluation:
    Compare AWS-native vs. third-party monitoring stacks to deliver a long-term recommendation aligned with the broader observability architecture.
4. Security & Governance
  • Risk Assessment:
    Evaluate current IAM and tagging strategies for multi-team isolation; identify scalability gaps and potential security risks.
  • Policy Engines:
    Assess the Cedar policy engine (AgentCore) for fine-grained tool access control and document gaps for enterprise-scale deployment.
  • Identity Architecture:
    Design a scalable Attribute-Based Access Control (ABAC) identity model to ensure multi-team isolation without IAM policy sprawl; deliver production-ready Terraform modules.
Position Requirements
10+ years of work experience