Senior Data Engineer, Bengaluru, Karnataka, India (IT/Tech)

Location: Bengaluru

Experience Required: 5 to 7 years

Your Key Responsibilities:

Develop a long-term vision for a highly scalable data platform, data management, and DataOps practices.
Design and architect data flows and data management in Hadoop or cloud environments that are scalable and repeatable and that eliminate time-consuming steps.
Promote a DataOps approach to automate data provisioning, testing, and monitoring, shorten development cycles, and increase deployment frequency.
Establish development and data governance processes to build mature data pipelines, CI/CD, test coverage, etc.
Evaluate tools and technology strategy for analytics data platforms and applications, and provide insights and recommendations in conjunction with the Enterprise Architecture team.
Lead data engineering work streams with a product mindset.

Who we are looking for:

Bachelor's or master's degree in Computer Science, Information Systems, or an equivalent field.
At least 5 years of experience in building data flows and data management on a modern big data tech stack.
Data Strategy:
Understands, articulates, and applies principles of the defined strategy to routine business problems that involve a single function.
Data Transformation and Integration:
Extracts data from identified databases. Creates data pipelines and transforms data into a structure that is relevant to the problem by selecting appropriate techniques. Develops knowledge of current analytics trends.
Data Source Identification:
Supports the understanding of the priority order of requirements and service level agreements. Helps identify the most suitable source for data that is fit for purpose.
Demonstrates expertise in writing complex, highly optimized queries across large data sets.
Strong experience in using ETL frameworks (e.g., Airflow, Oozie, Jenkins) to build and deploy production-quality ETL pipelines.
Experience in ingesting and transforming structured and unstructured data from internal and third-party sources into dimensional models.
Knowledge of data structures and distributed computing. Should be comfortable manipulating and analyzing high-volume data from a variety of internal and third-party sources.
Experience in one or more programming languages such as Python or PySpark, and moderate knowledge of Unix scripting.
Expertise in using query languages such as SQL, NoSQL, Hive, and Spark SQL.
Strong understanding of distributed storage and compute (Hive and Spark).
Experience in building stream processing jobs on Apache Spark or a similar streaming analytics technology.
Experience in debugging production issues, identifying root causes, and implementing mitigation plans.
Should be comfortable working within Microsoft Azure on services such as Azure Data Factory, Azure Synapse, Azure DevOps, and Azure Databricks.
An understanding of the basics of machine learning would be an added advantage.
Open to learning and implementing new technologies and to running proofs of concept (POCs) to explore the best solution for the problem statement.
Retail / E-Commerce background is an advantage.
Strong sense of urgency, appetite for learning, and commitment.