- Have at least 2 years of experience building Big Data solutions using the Hadoop ecosystem (Hadoop, HDFS, MapReduce framework, YARN, Apache Spark, HiveQL, Sqoop, etc.)
- Have hands-on experience with Apache Spark
- Are proficient in a programming language such as Java, Python, or Scala
- Have knowledge of and experience with Unix environments and shell scripting
- Have at least 2 years of RDBMS background (SQL Server, Oracle, and/or DB2)
- Have experience importing and exporting data with Sqoop between HDFS and relational database systems (a Sqoop sketch follows this list)
- Have knowledge of workflow schedulers such as Oozie or any other scheduling tool
- Have experience coordinating the movement of data from original data sources into NoSQL data lakes and cloud environments
- Have hands-on experience with Talend used in conjunction with Hadoop MapReduce, Spark, and Hive
- Have a solid understanding of file formats and data serialization formats such as Protocol Buffers, Avro, or JSON
- Have experience with Google Cloud Platform (Google BigQuery)
- Have experience with notebook/IDE environments such as Hue, Jupyter, or Zeppelin
- Solving data engineering and data science problems
- Developing data ingestion pipelines (a Spark ingestion sketch follows this list)
- Contributing to the Big Data team across Poland
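
As a rough illustration of the Sqoop skill named above, the commands below sketch a round trip between a relational database and HDFS. The JDBC URL, credentials file, table names, and HDFS paths are hypothetical placeholders, not details from this posting.

```sh
# Hypothetical connection URL, credentials, tables, and paths, for illustration only.

# Import a relational table into HDFS
# (a single mapper avoids the need for a --split-by column)
sqoop import \
  --connect jdbc:oracle:thin:@//db.example.com:1521/ORCL \
  --username etl_user \
  --password-file /user/etl/.db_password \
  --table CUSTOMERS \
  --target-dir /data/raw/customers \
  --num-mappers 1

# Export processed results from HDFS back into a relational table
sqoop export \
  --connect jdbc:oracle:thin:@//db.example.com:1521/ORCL \
  --username etl_user \
  --password-file /user/etl/.db_password \
  --table CUSTOMER_SUMMARY \
  --export-dir /data/processed/customer_summary \
  --num-mappers 1
```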
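
And as a minimal sketch of the kind of ingestion pipeline mentioned above, assuming Spark with Scala (one of the languages listed): read a table over JDBC and land it on HDFS as Parquet. The connection details, table name, and output path are again hypothetical.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Minimal ingestion sketch: JDBC source -> HDFS Parquet sink.
// Connection details, table names, and paths are hypothetical placeholders.
object CustomerIngestion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("customer-ingestion")
      .getOrCreate()

    // Read a relational table over JDBC
    val customers = spark.read
      .format("jdbc")
      .option("url", "jdbc:sqlserver://db.example.com;databaseName=sales")
      .option("dbtable", "dbo.customers")
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Land the raw data on HDFS as Parquet, overwriting any previous run
    customers.write
      .mode(SaveMode.Overwrite)
      .parquet("hdfs:///data/raw/customers")

    spark.stop()
  }
}
```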