At this time, we are unable to consider candidates who require visa sponsorship, and we cannot accept submissions from third-party recruitment agencies for this role. We thank you for your understanding.
Overview
This role is specifically a Hadoop Platform Engineer. The engineer will have deep, hands‑on experience across the Apache Hadoop ecosystem, focused on building and maintaining a reliable, scalable data platform, along with strong core Hadoop and Java expertise for diagnosing, optimizing, and tuning performance at the cluster and infrastructure levels. This role will also drive platform observability improvements, including standardizing monitoring, implementing health checks, and developing automated alerting systems, all to proactively identify and resolve issues before they impact users.
Responsibilities
- Design, build, and maintain a reliable, scalable, and high-performance Hadoop platform that supports large-scale data processing and analytics workloads.
- Diagnose and optimize cluster‑level performance issues across core Apache Hadoop components (HDFS, YARN, MapReduce, Hive, Spark, HBase) using deep Hadoop and Java expertise.
- Develop and standardize monitoring and observability frameworks for the Hadoop ecosystem, ensuring proactive detection of system health issues.
- Leverage strong Linux expertise (especially Ubuntu) to analyze system bottlenecks, perform kernel tuning, and optimize resource allocation for Hadoop workloads.
- Implement automated health checks and alerting systems to reduce operational noise and minimize reliance on user‑reported issues (a minimal sketch follows this list).
- Collaborate with data engineering, platform, and infrastructure teams to tune resource utilization, improve job efficiency, and ensure cluster stability.
- Establish and enforce operational standards for performance tuning, capacity planning, and version upgrades across the Hadoop platform.
- Automate repetitive operational tasks and improve workflow efficiency using scripting languages (Python, Bash, or Scala).
- Maintain and enhance data security, governance, and compliance practices within the Hadoop ecosystem.
- Drive root cause analysis (RCA) and develop preventive measures for recurring production incidents.
- Document best practices, operational runbooks, and configuration standards to ensure knowledge sharing and consistent platform management.
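To make the health-check and automation responsibilities above concrete, here is a minimal Python sketch of the kind of automated check this role might build: it polls the NameNode's JMX servlet and emits alerts when capacity or block health crosses a threshold. The hostname, thresholds, and alert routing are illustrative assumptions, not details from this posting, and exact metric names can vary by Hadoop version.

```python
"""Minimal HDFS health-check sketch (illustrative, not from the posting).

Assumes Hadoop 3.x defaults: the NameNode web UI/JMX servlet on port
9870 and the FSNamesystem JMX bean. Host, thresholds, and metric names
are placeholders; adjust for a real cluster.
"""
import json
import urllib.request

# Hypothetical NameNode; Hadoop 3.x serves /jmx on the web UI port (9870).
NAMENODE_JMX = ("http://namenode.example.com:9870/jmx"
                "?qry=Hadoop:service=NameNode,name=FSNamesystem")

CAPACITY_ALERT_PCT = 85.0  # placeholder: alert when DFS usage exceeds this
MAX_MISSING_BLOCKS = 0     # placeholder: any missing block triggers an alert


def fetch_fsnamesystem_metrics() -> dict:
    """Fetch the FSNamesystem bean from the NameNode JMX servlet."""
    with urllib.request.urlopen(NAMENODE_JMX, timeout=10) as resp:
        return json.load(resp)["beans"][0]


def check_health() -> list:
    """Return human-readable alert strings; an empty list means healthy."""
    m = fetch_fsnamesystem_metrics()
    alerts = []
    used_pct = 100.0 * m["CapacityUsed"] / m["CapacityTotal"]
    if used_pct > CAPACITY_ALERT_PCT:
        alerts.append("HDFS capacity at %.1f%% (threshold %.1f%%)"
                      % (used_pct, CAPACITY_ALERT_PCT))
    if m["MissingBlocks"] > MAX_MISSING_BLOCKS:
        alerts.append("%d missing blocks reported" % m["MissingBlocks"])
    if m.get("UnderReplicatedBlocks", 0) > 0:
        alerts.append("%d under-replicated blocks"
                      % m["UnderReplicatedBlocks"])
    return alerts


if __name__ == "__main__":
    for alert in check_health():
        # In production this would feed an alerting pipeline, not stdout.
        print("ALERT:", alert)
```

In practice a check like this would run on a schedule (cron or a systemd timer) and feed the standardized alerting stack the role is expected to build, rather than printing to stdout.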
Requirements
- 10+ years of professional experience in data engineering or related roles.
- Strong expertise in Apache Hadoop.
- Proficiency with Apache Hive and Apache Spark.
- Hands‑on experience with Hadoop Distributed File System (HDFS).
- Advanced programming skills in Java.
- Solid understanding of Linux environments.
- Experience with Grafana for monitoring and visualization.
- Knowledge of Python for scripting and automation.
- Familiarity with Trino (formerly PrestoSQL); a short smoke-test sketch follows this list.
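As a small illustration of the Trino familiarity listed above, here is a hedged sketch of a connectivity smoke test using the open-source `trino` Python client (pip install trino). The coordinator host, user, and catalog are placeholders, not details from this posting.

```python
"""Minimal Trino smoke-test sketch (illustrative, not from the posting).

Assumes the open-source `trino` Python client and a hypothetical
coordinator at trino.example.com:8080 with a Hive catalog configured.
"""
from trino.dbapi import connect


def trino_smoke_test() -> bool:
    """Run SELECT 1 through the coordinator to verify the SQL path."""
    conn = connect(
        host="trino.example.com",   # placeholder coordinator host
        port=8080,                  # default Trino HTTP port
        user="platform-healthcheck",
        catalog="hive",             # assumes a Hive catalog is configured
        schema="default",
    )
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")
        return cur.fetchone()[0] == 1
    finally:
        conn.close()


if __name__ == "__main__":
    print("Trino OK" if trino_smoke_test() else "Trino check FAILED")
```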
York Solutions offers a generous benefits package for eligible full‑time employees:
- BCBS Medical with 3 plans to choose from (PPO and high-deductible PPO plans with a Health Savings Program).
- Delta Dental plan with 2 free cleanings and insurance discounts.
- EyeMed Vision with annual check‑ups and discounts on lenses.
- Life and Accidental Death Insurance paid by the company.
- John Hancock 401(k) Retirement Plan with discretionary company match.
- Voluntary insurance programs such as: Hospital Indemnity, Identity Protection, Legal Insurance, Long Term Care, and Pet Insurance.
- Flexible work environment with some remote working opportunities.
- A fun, team-oriented environment.
- Learning, development, and career growth.