Lead Production Support Analyst

Job in Cedar Rapids, Linn County, Iowa, 52404, USA
Listing for: Transamerica
Full Time position
Listed on 2026-06-03
Job specializations:
  • IT/Tech
    IT Support, Cloud Computing
Salary/Wage Range or Industry Benchmark: 60000 - 80000 USD Yearly

Job Description

Responsibilities

  • Operational & Production Support Leadership
  • Lead day-to-day production support operations for Individual Solutions & WFG applications/services, ensuring high availability, performance, and stability.
  • Act as the accountable owner for the production support operating model, including L1/L2/L3 routing, on-call rotations, escalation paths, and SLAs/SLOs.
  • Oversee and coach a vendor/contractor support team, ensuring quality execution, clear accountability, and consistent outcomes across shifts/time zones.
  • Own application onboarding into production support: ensure runbooks, SOPs, architecture diagrams, support metrics, monitoring/alerting, access, and DR/backup readiness are complete and current.
  • Establish operational readiness standards across logging, monitoring, access controls, backup, disaster recovery, and maintenance windows.
  • Vendor Management & Service Delivery
  • Manage vendor performance (tickets, SLAs, MTTR, quality of RCAs, repeat incidents, documentation hygiene) and drive continuous service improvement.
  • Run recurring vendor governance: operational reviews, KPI scorecards, backlog prioritization, and corrective action plans.
  • Coordinate with third-party providers for escalations, service requests, planned maintenance, patching, and production changes.
  • Incident, Problem & Change Management
  • Serve as the primary escalation point for high-severity incidents; lead war rooms/bridge calls and drive timely resolution with strong communication.
  • Ensure Root Cause Analysis (RCA) and Post-Incident Reviews (PIRs) are completed with actionable remediation, prevention plans, and measurable follow-through.
  • Drive problem management: identify patterns and recurring issues using incident history, logs, and metrics; reduce repeat incidents through permanent fixes.
  • Oversee change/release execution to minimize production risk: pre-change validation, approvals, rollback plans, post-release monitoring, and “go/no-go” decision support.
  • Ensure adherence to ITSM processes and audit-ready evidence for incident/change/problem workflows.
  • Monitoring, Observability & Reliability
  • Improve detection and response through dashboards, health checks, distributed tracing/APM, synthetic monitoring, and log correlation.
  • Tune alerting to improve signal-to-noise; implement event correlation to prevent alert storms.
  • Partner with engineering and platform teams to define/track error budgets (where applicable) and reliability improvements.
  • Continuous Improvement, Automation & Incident Reduction
  • Proactively identify opportunities for automation (self-healing, auto-remediation, runbook automation, standardized scripts) that reduce toil and improve MTTR.
  • Drive operational standardization: repeatable onboarding, consistent runbooks, automated checks, and common monitoring patterns.
  • Lead initiatives focused on reducing incident volume, shortening recovery times, improving release quality, and removing manual steps from common procedures.
  • Technical Environment:
  • Cloud Platforms
  • AWS: EC2, Lambda, ECS/EKS, S3, CloudFront, Route 53, IAM, CloudWatch, API Gateway, Secrets Manager
  • Azure: Virtual Machines, Azure Functions, App Service, AKS, Entra, Azure Monitor/Log Analytics, Key Vault, API Management, Azure Backup
  • Monitoring & Observability
  • AppDynamics, Splunk, Prometheus, ELK, CloudWatch, Azure Monitor, Grafana
  • Incident & Event Management
  • ServiceNow (Incident/Problem/Change/Event), BigPanda, JIRA
  • Infrastructure, Middleware & Platforms
  • Linux/Windows Server fundamentals; networking basics (DNS, routing, LB, firewall rules)
  • Middleware/servers (as applicable): NGINX/Apache, Tomcat/WebLogic/JBoss, Kafka/MQ patterns
  • CI/CD & Scheduling
  • Jenkins/GitHub Actions/Cloud pipelines (where applicable)
  • Control-M/Cron/Airflow (where applicable)
  • Security & Access
  • IAM/role-based access, certificates, secrets management, key vaults

Qualifications

  • 8+ years in production support, IT operations, cloud operations, or SRE/Platform operations, with 3+ years in a lead role (team lead, service owner, or vendor lead).
  • Strong knowledge of ITSM/ITIL practices and hands-on experience with ServiceNow (Inc/Prob/Chg; Event Mgmt preferred).
  • Demonstrated ability to lead…