This job is with Kyndryl, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly.
Who We Are
At Kyndryl, we run and reimagine the mission-critical technology systems that drive advantage for the world's leading businesses. We are at the heart of progress, with proven expertise and a continuous flow of AI-powered insight that enables smarter decisions, faster innovation, and a lasting competitive edge. For our people, the Kyndryls, that means doing purposeful work that powers human progress. Join us and experience a flexible, supportive environment where your well-being is prioritized and your potential can thrive.
The Role
Role Summary
A Senior Kafka Operations Engineer is responsible for ensuring the stability, performance, and reliability of Kafka-based data streaming platforms in production. The role focuses on end-to-end operational support, advanced troubleshooting, and enabling development teams to build resilient, high-performing Kafka integrations.
This is a hands-on operational role, working in a 24/7 support environment, with deep involvement in Kafka clients, cluster health, and real-time incident resolution.
What You Will Do
Own production support for Kafka environments in a 24/7 on-call rotation
Monitor and maintain Kafka cluster performance, availability, and reliability
Perform advanced troubleshooting across the full Kafka stack: producers, consumers, brokers, and clusters
Analyze logs and metrics to proactively detect and resolve issues
Ensure minimal downtime and uninterrupted data flow
Deep-Dive Troubleshooting Areas
Kafka Clients:
Producer delivery failures, retries, idempotence, acknowledgments
Consumer lag, offset issues, delivery guarantees
Connectivity & Security:
TLS handshake failures
SASL authentication issues
Schema & Serialization:
Schema compatibility problems
Serializer/deserializer failures
Performance:
Slow producers/consumers
Throughput bottlenecks (e.g., compression, batching)
Cluster Health:
Partition hot spots
Broker performance issues
Replication/reliability concerns
Collaboration & Impact
Support and guide development teams on Kafka best practices
Help onboard applications onto Kafka
Act as a subject matter expert during incidents and root cause analysis
Improve system resilience and operational efficiency
What Makes This a Senior Role
Deep understanding of distributed systems and Kafka internals
Ability to troubleshoot complex, multi-layer issues under pressure
Strong communication with both engineering and non-engineering teams
Ownership of business-critical production environments
Who You Are
Position Requirements
10+ years of work experience