
EMS IoT Lead / Architect

Job in Roswell, Fulton County, Georgia, 30076, USA
Listing for: ALTEN
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Systems Engineer, IT Support
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Position: EMS IoT Lead / Architect

Overview

Job Title: EMS IoT Lead / Architect (Onsite - US/EST)

Location: Atlanta, GA / Roswell, GA (Onsite - US/EST)

Experience: 10+ years

Employment Type: Full-time, Onsite

About The Role
We are seeking an experienced EMS IoT Lead / Architect to provide system-level technical ownership for a large-scale connected IoT ecosystem. This role is responsible for platform stability, production issue resolution, and continuous improvement across devices, firmware, cloud services, mobile applications, and integrations. The EMS IoT Lead operates as the primary engineering authority during production incidents, leading triage, debugging, root cause analysis (RCA), fix strategy selection, and closure coordination.

The role requires close collaboration with engineering, operations, customer support, and customer-facing teams to ensure reliable and consistent user experiences.

Key Responsibilities

System-Level IoT Ownership

  • Own end-to-end technical accountability across the full IoT stack:
    • Device connectivity and telemetry
    • Firmware behavior and state management
    • Cloud services and data pipelines
    • Mobile applications and APIs
  • Understand and debug end-to-end connectivity flows from device and firmware through cloud platforms to mobile applications
  • Diagnose issues related to connectivity failures, message loss, latency, retries, state synchronization, and data inconsistencies
  • Prioritize issues by customer impact, severity, and recurrence rather than by component boundaries

Incident Management & US-Time Response

  • Act as the primary engineering escalation point during US business hours
  • Lead real-time investigation and response for:
    • Production incidents
    • NOC escalations
    • Customer-facing issues
  • Evaluate and select the most appropriate resolution strategy, including:
    • Hotfixes
    • Configuration changes
    • Rollbacks
    • Permanent code fixes
  • Drive rapid mitigation to stabilize incidents while minimizing customer impact

Debugging, RCA & Resolution Leadership

  • Lead deep debugging and root cause analysis across distributed systems
  • Analyze logs, telemetry, metrics, and traces across device, cloud, and application layers
  • Determine whether issues can be resolved via:
    • Tactical fixes
    • Operational or configuration changes
    • Architectural or design changes
  • Drive fixes to completion, coordinating development, validation, deployment, and verification until issues are fully resolved in production
  • Ensure all resolved issues include clear RCA documentation and corrective actions

Cross-Functional & Offshore Team Collaboration

  • Work closely with:
    • Cloud engineering teams
    • Mobile engineering teams
    • Firmware and platform teams
  • Collaborate with offshore engineering teams, providing:
    • Clear RCA context
    • Technical direction
    • Execution priorities
  • Enable effective follow-the-sun execution while maintaining ownership and continuity

Customer Support & Stakeholder Communication

  • Partner closely with Customer Support and NOC teams during incidents and escalations
  • Communicate issue status, impact, and resolution progress clearly and consistently
  • Coordinate with Marketing and customer-facing teams to support accurate and aligned customer messaging during incidents or service degradations
  • Ensure timely and transparent communication throughout the issue lifecycle

Escalation & Governance

  • Escalate issues to core engineering or product teams only when they cannot be resolved through EMS
  • Prepare high-quality escalation packages, including:
    • Completed RCA
    • Reproduction steps
    • Impact assessment
    • Design or architectural considerations
  • Maintain tracking and visibility of escalated issues through closure

Process Improvement & Platform Stability

  • Establish and enforce standards for:
    • Issue intake quality
    • Triage consistency
    • RCA documentation
    • Closure and communication
  • Analyze trends and recurring issues to identify systemic risks
  • Drive continuous improvements to reduce incident frequency and improve platform reliability

Technology Environment

The role requires hands-on familiarity with modern IoT and cloud platforms, including:

Cloud & Platform

  • AWS (compute, networking, deployments, monitoring)
  • ClearBlade IoT Platform
  • Datadog (logs, metrics, tracing, incident analysis)
  • MongoDB or similar NoSQL databases

IoT & Messaging

  • MQTT-based device communication
  • Device…