Description
In this role, the Big Data Engineer will be part of the Data Sciences Innovation Lab. This team serves as the gatekeeper and curator of the big data collected into the Data Warehouse/Data Lake from all aspects of our fashion e-commerce business, and supports data-driven decision-making. The engineer will conduct complex data analysis, continuously evolve management reporting, and deliver business insights in an environment of rapid growth and increasing complexity.
Responsibilities:
- Drive the full lifecycle of Big Data projects, from gathering and understanding end-user needs to implementing a fully automated solution.
- Develop and provision data pipelines that enable self-service reports and dashboards.
- Apply AI/machine learning techniques in R or Python to answer the relevant business problems.
- Visualize data using Tableau and create repeatable visual analyses that end users can use as tools.
- Take ownership of the existing BI platforms, maintaining the integrity and accuracy of their data and data sources.
- Apply Agile/Scrum project management experience: prioritise, push back, and effectively manage a data product and sprint backlog.
Requirements:
- 5+ years of experience building scalable and reliable ETL/ELT pipelines and processes to ingest data from a variety of data sources, preferably in the e-commerce retail industry.
- Experience in building robust scalable data pipelines using distributed processing frameworks (e.g. Spark, Hadoop, EMR, Flink, Storm), integrated with asynchronous messaging systems (e.g. Apache Kafka, Kinesis, PubSub, MQ Series).
- Deep understanding of relational database management systems (RDBMS) (e.g. PostgreSQL, MySQL) and NoSQL databases (e.g. MongoDB, Elasticsearch), and hands-on experience in implementation and performance tuning of MPP databases (e.g. Redshift, BigQuery).
- Strong programming, algorithmic, and data processing skills, with significant experience producing production-ready code in Python, Scala, Java, etc., and engineering experience with machine learning projects such as time series forecasting, classification, and optimization problems.
- Experience administering and deploying CI/CD tools (e.g. Git, Jira, Jenkins), infrastructure automation tools (e.g. Ansible, Terraform), and workflow management tools (e.g. Airflow, Jenkins, Luigi) in Linux operating system environments.
- Experience designing and implementing software for Data Security, Cryptography, Data Loss Prevention (DLP), or other security tools.
- Experience with Tableau, Power BI, Superset or any standard data visualization tools.
- Exhibits sound business judgment, strong analytical skills, and a proven track record of taking ownership, leading data-driven analyses, and influencing results.
- Knowledge of cloud services such as AWS, GCP, or Azure is a strong advantage.
- An e-commerce, logistics, or fashion retail background is a bonus.