DevOps Lead Engineer
Listed on 2026-02-01
IT/Tech
SRE/Site Reliability, IT Support
Overview
Unlock your potential with Quantum ePay®. We're a full-service financial technology provider that helps businesses lower costs, earn more, and improve their quality of life. We offer innovative payment processing solutions and an ever-expanding line of products to boost productivity, enabling our clients to operate efficiently, effectively, and with confidence.
We're seeking a Dev Ops Lead Engineer to take ownership of day-to-day Dev Ops operations and production reliability for our core systems. This role is hands-on and execution-focused, responsible for ensuring system availability, leading incident response, improving observability, and driving operational maturity across our infrastructure. You'll work closely with Engineering, Product, and Support teams to keep our platforms stable, performant, and resilient as we scale.
Responsibilities
Dev Ops & Production Operations
- Lead day-to-day Dev Ops and production support, ensuring system availability, performance, and reliability.
- Drive incident resolution, root cause analysis, and long-term remediation.
- Maintain and improve runbooks, SOPs, and escalation paths.
- Continuously reduce MTTR (mean time to resolution) through tooling, automation, and process improvements.
- Design and maintain monitoring, logging, and alerting across infrastructure and applications.
- Optimize observability using tools such as AWS, Sentry, Grafana, Airflow, and Kafka.
- Ensure alerts are actionable and dashboards provide real-time operational visibility.
- Lead incident response and on-call coordination, including severity classification and real-time resolution.
- Own post-incident reviews and corrective action tracking.
- Monitor and report on MTTR, incident trends, and system availability.
- Partner with engineering teams to improve resilience and fault tolerance.
- Serve as the primary operational liaison between Dev Ops, Engineering, Product, and Support.
- Provide clear, concise incident and operational summaries to leadership.
- Improve Dev Ops workflows, incident tracking, and documentation using Jira and Confluence.
- Ensure client-impacting issues are prioritized, resolved, and communicated effectively.
Qualifications
- 5+ years of experience in Dev Ops, production engineering, or related roles.
- Prior experience leading or acting as a senior technical owner for production systems.
- Strong hands-on experience with AWS and production monitoring/alerting.
- Proven experience supporting high-availability, customer-facing platforms.
- Strong written and verbal communication skills.
- Experience in fintech, payments, or regulated environments.
- Familiarity with event-driven architectures (e.g., Kafka).
- Experience with CI/CD, automation, and infrastructure-as-code.
- Experience owning on-call rotations, SLAs, and reliability metrics.
This role includes biannual profit-sharing bonuses as part of a total compensation package, in addition to a full range of medical, dental, retirement planning, and other benefits.
- Salary range: $85,000 - $115,000
- Total compensation range, including biannual profit-sharing bonus and comprehensive benefits program (annualized): $109,725 - $145,725
This role requires the employee to work fully onsite at our Anaheim Hills location.
Why Work Here?
- Awarded Top Workplace of Orange County by the OC Register!
- Flex PTO!
- New state-of-the-art, open-concept facility with stand-up desks, balance boards, stationary bikes, and more!
- Work hard, play hard culture!
- Monthly Beer Socials and BBQs!
- Proven "promote from within" mentality!
- Benefit offerings:
  - Medical, dental, vision, acupuncture, and chiropractic
  - 401(k) Safe Harbor: 100% employer match up to 4%, processed semi-monthly
  - Profit sharing, paid on a biannual basis