Sr. Hadoop Administrator
Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: InfoCepts
Full Time position, listed on 2025-12-15
Job specializations:
- IT/Tech: Cloud Computing, Data Engineer, SRE/Site Reliability, Big Data
Job Description
The mission of the Big Data Operations team is to help teams harness the power of Big Data by providing a reliable and robust platform. We’re currently building a Next‑Gen Big Data platform on AWS while we maintain and scale the existing platform in our data centers to meet current demands. We’re responsible for capacity planning, security, and disaster recovery for our Next‑Gen platform in AWS.
It is important for us to provide greater visibility into the operational telemetry of our Big Data platform by collecting logs and metrics from various sources and setting alarms on them, so that we identify issues proactively rather than react to them.
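As a concrete illustration (not this team's actual setup), the sketch below shows the kind of proactive alarm this implies: a boto3 call that creates a CloudWatch alarm on a custom HDFS capacity metric. The namespace, metric name, threshold, and SNS topic ARN are all placeholder assumptions.

```python
# A minimal sketch, assuming the platform already publishes a custom
# "CapacityRemainingPercent" metric under a "BigData/HDFS" namespace.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-west-2")

cloudwatch.put_metric_alarm(
    AlarmName="hdfs-capacity-remaining-low",
    Namespace="BigData/HDFS",               # placeholder namespace
    MetricName="CapacityRemainingPercent",   # placeholder custom metric
    Statistic="Average",
    Period=300,                              # evaluate 5-minute windows
    EvaluationPeriods=3,                     # require 3 consecutive breaches
    Threshold=15.0,
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="breaching",            # missing telemetry is itself a problem
    AlarmActions=["arn:aws:sns:us-west-2:123456789012:bigdata-oncall"],  # placeholder SNS topic
    AlarmDescription="Page on-call when HDFS free capacity stays below 15% for 15 minutes.",
)
```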
Responsibilities:
- Design, build, scale, and maintain the infrastructure in both the data center and AWS to support Big Data applications.
- Design, build, and own the end‑to‑end availability of the Big Data platform in both AWS and data center.
- Improve the efficiency, reliability, and security of our Big Data infrastructure, while ensuring a smooth experience for developers and analysts.
- Work on automation to build and maintain the new platform on AWS.
- Build custom tools to automate day‑to‑day operational tasks (see the sketch after this list).
- Be responsible for setting the standards for our production environment.
- Take part in 24×7 on‑call rotation with the rest of the team and respond to pages and alerts to investigate issues in our platform.
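The following is a minimal sketch of the kind of day‑to‑day automation tool referred to above, not an existing team script. The idle‑cluster heuristic (time since the cluster became ready), the region, and the 4‑hour threshold are assumptions for illustration only.

```python
# List EMR clusters sitting in WAITING state and flag ones idle longer than a threshold.
from datetime import datetime, timezone, timedelta

import boto3

emr = boto3.client("emr", region_name="us-west-2")  # placeholder region


def find_idle_clusters(max_idle=timedelta(hours=4)):
    """Return (cluster_id, name) pairs for WAITING clusters idle longer than max_idle."""
    idle = []
    paginator = emr.get_paginator("list_clusters")
    for page in paginator.paginate(ClusterStates=["WAITING"]):
        for cluster in page["Clusters"]:
            # Rough heuristic: measure idleness from the time the cluster became ready.
            ready = cluster["Status"]["Timeline"].get("ReadyDateTime")
            if ready and datetime.now(timezone.utc) - ready > max_idle:
                idle.append((cluster["Id"], cluster["Name"]))
    return idle


if __name__ == "__main__":
    for cluster_id, name in find_idle_clusters():
        print(f"Cluster {cluster_id} ({name}) has been idle for more than 4 hours")
```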
Requirements:
- Strong experience with Hadoop ecosystem components such as HDFS, YARN, Hive, Spark, Oozie, Presto, and Ranger.
- MUST have strong experience with Amazon EMR.
- Good working experience with RDS and a good understanding of IaaS and PaaS.
- Strong foundation in Hadoop security, including SSL/TLS, Kerberos, and role‑based authorization.
- Experience tuning the performance of Hadoop clusters, ecosystem components, and MapReduce/Spark jobs.
- Experience with infrastructure automation using Terraform, CI/CD pipelines (Git, Jenkins, etc.), and configuration management tools like Ansible.
- Able to leverage technologies such as Kubernetes, Docker, and the ELK stack to help our Data Engineers/Developers scale their efforts in creating new and innovative products.
- Experience providing and implementing log‑based monitoring solutions using CloudWatch, CloudTrail, and Lambda (see the sketch after this list).
- Ability to run a post‑mortem when something goes wrong with your systems: identify what went wrong and provide a detailed RCA.
- Proficiency in Bash & Python or Java.
- Good understanding of all aspects of JRE/JVM and GC tuning.
- Hands‑on experience with RDBMS (Oracle, MySQL) and basic SQL.
- Hands‑on experience with Snowflake is a plus.
- Hands‑on experience with Qubole and Airflow is a plus.
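For the log‑based monitoring item above, here is a minimal sketch assuming a CloudWatch Logs subscription feeding a Lambda function: the handler decodes the compressed payload and publishes an error‑count metric that alarms can be set on. The namespace and the simple "ERROR" substring match are illustrative assumptions, not a prescribed design.

```python
# Lambda handler for a CloudWatch Logs subscription: count ERROR lines and
# publish them as a custom CloudWatch metric.
import base64
import gzip
import json

import boto3

cloudwatch = boto3.client("cloudwatch")


def handler(event, context):
    # CloudWatch Logs delivers subscription data as base64-encoded, gzip-compressed JSON.
    payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))
    error_count = sum(1 for entry in payload["logEvents"] if "ERROR" in entry["message"])
    if error_count:
        cloudwatch.put_metric_data(
            Namespace="BigData/Logs",  # placeholder namespace
            MetricData=[{
                "MetricName": "ErrorLines",
                "Dimensions": [{"Name": "LogGroup", "Value": payload["logGroup"]}],
                "Value": error_count,
                "Unit": "Count",
            }],
        )
    return {"errors": error_count}
```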