API Production Reliability Engineer

Job in New York, New York County, New York, 10261, USA
Listing for: Citi
Full Time position
Listed on 2026-05-31
Job specializations:
  • IT/Tech
    Cloud Computing, IT Support, Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Location: New York

API Production Support Engineer - Officer

Job Req :

Location(s): Mississauga, Ontario, Canada

Job Type: Hybrid

Posted: May 11, 2026

Discover your future at Citi

Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.

Job Overview

At Citi, we’re passionate about building and maintaining highly reliable APIs that solve critical customer problems. We support mission-critical systems, giving our customers a rich feature set, high availability, and strong performance for their financial transactions. As we continue to expand our API scope and capabilities, we are seeking an experienced and dedicated API Production Support Engineer who will take full hands-on responsibility for the operational excellence and continuous improvement of our API ecosystems.

This role requires an individual who brings fresh ideas, demonstrates a unique and informed viewpoint on API reliability, and enjoys collaborating with cross-functional teams to develop real-world solutions and ensure positive user experiences at every interaction. Our ultimate goal is to build proactive and predictive operational strategies, including leveraging intelligent automation, to avoid customer impacts.

Objectives of this Role

  • Champion stability initiatives to enable high availability and resilience for our API applications, including enhancing monitoring, failover mechanisms, and overall system health.

  • Remain calm and analytical during major incidents on critical API systems, ensuring effective incident, problem, and change management at a global enterprise level.

  • Perform proactive monitoring and management of production API environments, taking a holistic view of system health and performance.

  • Drive the definition, analysis, and reporting of SLIs and SLOs for all supported APIs and clients, ensuring clear performance benchmarks.

  • Contribute to the development and implementation of tools and systems designed to enhance API operational management and the client experience.

  • Measure and optimize API system performance, always pushing capabilities forward, anticipating customer needs, and innovating for continuous improvement.

  • Provide hands-on expert operational support for critical, large-scale distributed API ecosystems.

Daily and Monthly Responsibilities

  • Actively gather and analyze performance metrics from API platforms and underlying infrastructure to assist in performance tuning, fault finding, and capacity planning.

  • Partner closely with API development teams to improve services through rigorous operational feedback loops, testing, and release procedures.

  • Drive the creation of sustainable API operational systems and services through automation and continuous uplifts, including developing, testing, and debugging automated tasks.

  • Conduct thorough post-incident reviews for API-related issues, identifying opportunities for automation and proactive monitoring to prevent recurrence.

  • Actively participate in and take complete hands-on responsibility for high-priority API production support activities, ensuring swift resolution and clear communication.

Required Qualifications:

  • Extensive experience supporting Java and J2EE based applications.

  • Deep technical knowledge and hands-on experience supporting and troubleshooting environments including AWS, ECS, Oracle DB, and MongoDB.

  • A strong understanding and practical application of SRE concepts, particularly in defining and measuring SLIs, SLOs, and error budgets.

  • Demonstrated experience in building and using comprehensive monitoring solutions such as AppDynamics, Splunk, and Kibana to proactively alert on production API-related issues and ensure system health.

  • Mandatory: In-depth knowledge and hands-on experience with API gateway technologies, specifically Apigee, and CDN solutions such as Akamai.

  • Proven ability to proactively identify and address problems, areas for improvement, and performance bottlenecks within complex API ecosystems using software-based…
