Principal Engineer

Job in Englewood, Arapahoe County, Colorado, 80111, USA
Listing for: A2Z Sync
Full Time position
Listed on 2026-06-03
Job specializations:
  • Software Development
    Cloud Engineer - Software, Software Engineer
Job Description & How to Apply Below
Why This Role Exists

We operate a multi-tenant automotive SaaS platform serving thousands of dealer groups across the United States. Our backend - event-driven serverless on AWS (Lambda, EventBridge, DynamoDB, S3, Step Functions) - orchestrates everything from dealer onboarding to inventory management to real-time transaction processing. That platform works. Now we need to make it think.

We are building agentic AI systems: autonomous, tool-using agents that observe platform state, reason over dealer context, take action through production APIs, and learn from outcomes. These are not chatbots bolted onto a dashboard. They are first-class platform services - backed by AWS Bedrock, connected to production systems via MCP servers - that make decisions, execute workflows, and close loops without human intervention unless guardrails say otherwise.

This Principal Engineer owns that entire surface. You are not advising on AI strategy from a whiteboard. You are writing agent code, defining tool interfaces, building evaluation harnesses, setting cost and latency budgets, and shipping production AI workflows that touch real dealers and real money. You set the engineering patterns the team follows, you help make the build-vs-buy calls, and when an agent misbehaves at 2 AM, your architecture is what determines whether it fails safe or fails loud.
Scope & Scale
  • 5000+ destination dealer tenants, each with isolated databases and per-tenant configuration.
  • Billions in annual Gross Merchandise Value (GMV) flowing through platform transactions.
  • Tens of thousands of API requests per minute across REST, SOAP, and event-driven integration surfaces.
  • Data pipelines spanning 6 integration domains with multi-protocol vendor connectivity.
What You Will Own
  • Ownership and core development of agentic AI systems - designing, building, and operating the AI agent infrastructure (AWS Bedrock, MCP servers) that powers intelligent automation across the platform. You are not advising on AI strategy; you are writing the agent code, defining the tool interfaces, building the evaluation harnesses, and shipping production AI workflows.
  • AI agent lifecycle end to end - from prompt engineering and tool-use design through guardrails, evaluation, cost optimization, and production observability. You own the patterns the team uses to build with AI: how agents connect to production systems, how we evaluate output quality, how we manage model costs at scale, and how we roll back when an agent misbehaves.
  • System design and technical decision-making for migration waves - from identity/tenant services through core domain extraction and frontend decomposition.
  • The dual-write framework, API Gateway traffic-splitting, and per-tenant feature flag rollout that make every migration step reversible.
  • Cross-cutting concerns: observability (OpenTelemetry, CloudWatch), security posture (Auth0 consolidation, IAM), and data architecture (DynamoDB single-table design, Aurora consolidation).
  • Mentoring and force-multiplying senior ICs - establishing patterns, reviewing designs, and raising the technical bar across 5 engineering teams.
  • Consolidating and setting the strategy for 30+ different integrations, and making future integrations easier to build.
Technical Environment
  • Cloud Services:
    High-availability AWS stack including Lambda, EventBridge, DynamoDB, S3, ECS Fargate, Aurora, API Gateway, CloudWatch, and Secrets Manager.
  • Development Languages:
    Modern Python and Java (Spring Boot) alongside TypeScript/React (Next.js 16) frontends, with legacy domain coverage in PHP/Laravel.
  • AI & Agentic Systems:
    Advanced agentic workflow orchestration utilizing lean AWS Bedrock AgentCore, MCP servers, or LangChain/LangGraph frameworks.
  • Data Engineering:
    Complex data architectures featuring DynamoDB single-table design, MySQL/Aurora, S3 data lakes, Glue Data Catalog, Athena, and data pipelines.
  • Infrastructure & Security:
    Enterprise-grade CI/CD and observability via CloudFormation, Auth0 consolidation, OpenTelemetry, and CircleCI.
  • Integration Surfaces:
    Multi-protocol connectivity spanning REST, SOAP/XML, EventBridge event-bus patterns, SES processing, and Playwright browser automation.
First 12 Months
  • Months 1-3:
    Immerse in the codebase. Audit the current architecture across all stacks. Publish the first Architecture Decision Record (ADR) for the next migration wave. Establish your design review cadence with the team.
  • Months 4-6:
    Drive the AI/agentic integration layer - Bedrock-powered automation in at least one production workflow. Establish the patterns for how the team builds with AI going forward, covering both agentic insight retrieval and agentic workflow automation.
  • Months 7-9:
    Own and deliver the first migration wave end-to-end - from design doc through production cutover with dual-write validation. Stand up the observability baseline (OpenTelemetry instrumentation, dashboards, SLOs).
  • Months 10-12:
    Second migration wave in production. Architecture runway documented for the next 12 months. The team operates at a higher technical bar…