Job Description & How to Apply Below
Key Responsibilities
Manage and maintain production environments ensuring high availability and reliability.
Perform system monitoring, performance tuning, and capacity planning.
Analyze and debug production issues by leveraging Airflow logs, Spark UI, and Hive query performance metrics.
Build and maintain dashboards and alerts in Grafana and Kibana for proactive monitoring and issue detection.
Monitor and troubleshoot OCP (OpenShift Container Platform) clusters and associated components.
Write and optimize SQL queries to analyze and troubleshoot data issues.
Collaborate with development, data engineering, and operations teams to ensure system reliability and scalability.
Participate in on-call rotations and incident management processes.
Automate routine operational tasks using scripting (Shell, Python, etc.).
Ensure adherence to best practices in observability, monitoring, and incident response.
Required Skills & Experience
4–6 years of experience as an SRE, DevOps Engineer, or in a similar role.
Strong expertise in Linux systems.
Solid understanding of SQL with the ability to write and optimize queries.
Good working knowledge of Hive and Spark; ability to use Spark UI for debugging performance issues.
Hands-on experience in monitoring and analyzing logs using Kibana and Grafana.
Experience in Airflow log analysis and DAG issue resolution.
Familiarity with OCP (OpenShift) or other Kubernetes-based platforms for cluster monitoring.
Strong analytical, debugging, and problem-solving skills.
Scripting skills in Shell or Python for automation.
Understanding of CI/CD and deployment best practices is a plus.
Good working knowledge of querying tools such as JupyterHub and Metabase.
Preferred Qualifications
Experience with cloud platforms (AWS, GCP, or Azure).
Knowledge of Prometheus, Elastic Stack, or similar observability tools.
Exposure to incident management and postmortem analysis.
Familiarity with big data pipelines and distributed systems.