Data Engineer

  • Gurugram
  • NAB

Purpose

To maximise the effective use and value of Marketing and Enterprise data assets by delivering on a strategic data management roadmap through a combination of:

• The implementation of a structured and comprehensive approach to data management and data governance within Marketing (both tactical and strategic).

• End-to-end delivery and governance of data features within the Marketing data environment (S3, Redshift, Airflow, Databricks).

• Working closely with business and technology teams to leverage enterprise Target State data capabilities.


This role requires critical thinking, a high degree of technical skill and strong attention to detail. It carries varying degrees of analysis, design, development, documentation, testing and support responsibilities.


Specific accountabilities for the role:

• Manage the team's weekly deployment of changes to PROD environments.

• Support the business with monitoring of pipelines and with investigation of incidents through to resolution.

• Analyse data for use in Marketing campaigns; design solutions, data models and code for data features.

• Deliver data features per solution designs for Marketing campaigns.

• Develop optimal, performant queries, maintain code standards and adhere to Data Governance principles; translate business requirements for technology teams (a minimal pipeline sketch follows this list).
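
For illustration only, a minimal sketch (assuming Airflow 2.x, named in the stack above) of the kind of scheduled pipeline this role would deploy and monitor; the DAG, task and function names are hypothetical:

    # Minimal Airflow 2.x DAG sketch; all names here are hypothetical.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def build_feature(**context):
        # Placeholder for feature-build logic (e.g. read from S3,
        # transform, load into Redshift).
        print(f"Building features for {context['ds']}")

    with DAG(
        dag_id="marketing_feature_refresh",        # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
        default_args={
            "retries": 2,                          # retry transient failures
            "retry_delay": timedelta(minutes=10),
            "email_on_failure": True,              # supports incident monitoring
        },
    ) as dag:
        PythonOperator(task_id="build_feature", python_callable=build_feature)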


Essential capabilities

• Advanced, demonstrable experience in SQL and Python

• Experience with source code control (GitHub) and Jenkins

• Ability to design, develop and execute complete ETL pipelines (see the sketch after this list)

• Ability to articulate complex technical issues and desired outcomes of system enhancements

• Proven analytical skills and evidence-based decision making

• Excellent problem solving, troubleshooting and documentation skills

• Strong written and verbal communication skills

• Excellent collaboration and interpersonal skills

• Strong delivery focus with an active approach to quality and auditability
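
As a hedged illustration of the ETL ability above, one minimal extract-transform-load step in Python; the bucket, key and opt-in rule are hypothetical, and a production pipeline would add logging, retries and data-quality checks:

    # Minimal ETL step; bucket/key names and the opt-in rule are hypothetical.
    import csv
    import io

    import boto3

    def extract(bucket: str, key: str) -> list[dict]:
        """Extract: read a CSV object from S3 into a list of row dicts."""
        body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
        return list(csv.DictReader(io.StringIO(body.decode("utf-8"))))

    def transform(rows: list[dict]) -> list[dict]:
        """Transform: keep only customers opted in to marketing (hypothetical rule)."""
        return [r for r in rows if r.get("marketing_opt_in") == "Y"]

    def load(rows: list[dict]) -> None:
        """Load: stand-in for a Redshift COPY or Databricks write."""
        print(f"Loading {len(rows)} rows")

    load(transform(extract("example-bucket", "marketing/customers.csv")))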

Desired capabilities

• Experience with Artifactory and Jenkins for deployment of production code

• Knowledge of and exposure to Hadoop-stack Big Data technologies such as HDFS, Hive, Impala and Spark, and to cloud Big Data warehouses such as Redshift and Databricks

• Exposure to AWS technologies including EMR, Glue, Athena, Data Pipeline and Lambda (an illustrative Athena call follows this list)

• Exposure to technologies such as Airflow, GSC, GitHub
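
For illustration, a sketch of querying data via Athena with boto3 (one of the AWS services listed above); the database, query and output location are hypothetical placeholders:

    # Illustrative Athena query via boto3; all identifiers are hypothetical.
    import boto3

    athena = boto3.client("athena")

    response = athena.start_query_execution(
        QueryString="SELECT campaign_id, COUNT(*) FROM events GROUP BY campaign_id",
        QueryExecutionContext={"Database": "marketing_db"},
        ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
    )
    print(response["QueryExecutionId"])  # poll get_query_execution for status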


Experience

Hands-on experience in SQL and its Big Data variants (HiveQL, Databricks ANSI SQL, Redshift SQL).

It is expected that the role holder will have the following qualifications and experience:

• 4-10 years of technical experience (within the financial services industry preferred)

• Solid experience, knowledge and skills in Data Engineering and BI/software development, such as ELT/ETL and data extraction and manipulation in Data Lake/Data Warehouse/Lake House environments

• Hands-on programming experience writing Python, SQL, Unix shell and PySpark scripts in a complex enterprise environment (see the sketch after this list)

• Experience in configuration management using Ansible/Jenkins/Git

• Hands-on experience using AWS services: S3, EC2, EMR, SNS, SQS, Lambda, Redshift

• Experience with source control tools: GitHub or Bitbucket

• Skilled in querying data from a range of data sources that store structured and unstructured data

• Knowledge or understanding of Power BI (recommended)
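
As a sketch of the PySpark experience described above, a short lake-house style transformation; the S3 paths and column names are hypothetical:

    # Illustrative PySpark job; paths and columns are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("campaign_features").getOrCreate()

    events = spark.read.parquet("s3://example-bucket/marketing/events/")

    # Aggregate raw events into one feature row per customer.
    features = (
        events.groupBy("customer_id")
        .agg(
            F.count("*").alias("event_count"),
            F.max("event_ts").alias("last_event_ts"),
        )
    )

    features.write.mode("overwrite").parquet("s3://example-bucket/marketing/features/")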


Key tasks, responsibilities and challenges of this role

• Design, develop, test, deploy, maintain and improve data products

• Write well-designed, high-quality, testable code (see the test sketch after this list)

• Develop and contribute to software verification plans and quality assurance procedures

• Deploy data products into Production environments

• Contribute to team estimation for delivery and expectation management for scope

• Comply with industry standards and regulatory requirements
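
To illustrate the testable-code expectation, a minimal pytest-style unit test for the transform step from the ETL sketch earlier (repeated here so the test is self-contained); the opt-in rule remains hypothetical:

    # Minimal pytest-style test; the transform rule is hypothetical.
    def transform(rows: list[dict]) -> list[dict]:
        return [r for r in rows if r.get("marketing_opt_in") == "Y"]

    def test_transform_keeps_only_opted_in_rows():
        rows = [
            {"customer_id": "1", "marketing_opt_in": "Y"},
            {"customer_id": "2", "marketing_opt_in": "N"},
        ]
        assert transform(rows) == [{"customer_id": "1", "marketing_opt_in": "Y"}]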