Description
We are working on a new analytics platform that will be used to manage all of the clinical and business processes of one of the biggest pharmacies in the world. This is a big data project that involves ingesting data from various sources and streaming it into Hadoop (Hortonworks). Inside the Hadoop ecosystem, we have multiple zones through which the data is moved and transformed. Data will be analyzed using BI tools and will be used to increase patient safety, reduce mortality, improve customer satisfaction, shorten prescription processing time, optimize supply chain management, and deliver a competitive advantage.
Requirements
- At least 1 year of overall programming experience and a strong commitment to developing in the Big Data area
- Knowledge of Scala or Java, SQL, J2EE, and databases such as Oracle and/or SQL Server
- Experience with any ETL tool
- Understanding of Partitioning and Distribution concepts
- Experience troubleshooting performance issues, SQL tuning, etc.
- Knowledge of Jira, Confluence and IntelliJ
- Good knowledge of English.
- Ability to work full time
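To give candidates a sense of the "partitioning and distribution concepts" mentioned above, here is a minimal, purely illustrative Java sketch (the class and method names are our own, not part of any project codebase) of how hash partitioning assigns a record's key to a partition, which is the idea behind Spark's HashPartitioner and Hive bucketing:

```java
// Illustrative sketch: hash partitioning maps a record key to one of N
// partitions, so all records with the same key land on the same node.
public class PartitioningSketch {

    // Non-negative modulo of the key's hash code over the partition count.
    static int partitionFor(Object key, int numPartitions) {
        int mod = key.hashCode() % numPartitions;
        return mod < 0 ? mod + numPartitions : mod;
    }

    public static void main(String[] args) {
        // The same key always maps to the same partition, which enables
        // co-located joins and aggregations in a distributed engine.
        for (String key : new String[] {"patientA", "patientB", "rx-1001"}) {
            System.out.println(key + " -> partition " + partitionFor(key, 8));
        }
    }
}
```

In a distributed engine the partition index decides which worker holds the record; skewed key distributions therefore translate directly into the performance issues a candidate would be expected to troubleshoot.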
Preferences
- Experience with Spark, Spark Streaming and Kafka
- Hands-on experience in developing systems/applications with the Hadoop ecosystem (Hortonworks or Cloudera): MapReduce, Pig, Hive, Sqoop, and Flume
- Experience with Cassandra and HBase
Responsibilities
- Developing the data analytics platform
- Designing the storage and processing layers
- Developing data processing pipelines, optimizing ETL jobs, and building a proper data model for the data warehouse
- Checking the performance of the solutions
- Designing and coding efficient and effective solutions for challenging problems for large work efforts of medium complexity
- Collaborating with architects, developers, and the product team, and presenting alternative approaches to complex problems
- Documenting techniques/approaches, process flow diagrams, and data models as required