This event has already taken place.

New generation integration (NiFi, Kylo) and Spark SQL internals

Event:
New generation integration (NiFi, Kylo) and Spark SQL internals
Event type:
Meetup
Category:
IT
Topic:
Date:
06.09.2017 (Wednesday)
Time:
18:00
Language:
English
Price:
Free
City:
Place:
Barka Alrina
Address:
Bulwar Kurlandzki, level with Gazowa Street, next to the Kładka Bernatka footbridge
Description:

Dear DataKRKers,

Soon, we are hosting another event where we have two great presentations confirmed:

  • New generation data integration tools: NiFi and Kylo

Abstract:

Many enterprise organizations lack the expertise to make the transition from traditional data warehousing strategies to operationalized big data. To assist with this issue, companies have started using a multitude of newer generation big data integration tools. In this session, we will explore two such tools: NiFi and Kylo. Apache NiFi comes from the NSA project NiagaraFiles, and supports powerful and scalable directed graphs of data routing, transformation, and system mediation logic — essentially, flow-based programming for big data. Kylo is a data lake management software platform for self-service data ingest and data preparation with integrated metadata management, governance, security and best practices inspired by Think Big’s 150+ big data implementation projects. 

Bio: 

Nicholas Fish works with Think Big Analytics, a Teradata Company, helping companies gain insights about their businesses from the many datasets they have accrued over time. When he’s not wrangling data in Hive, he can be found in Copenhagen, Denmark, usually either riding his bike or walking his dog Bobbie.

  • Spark SQL internals, debugging and optimization

Abstract:

In recent years Apache Spark has received a lot of hype in the Big Data community. It is seen as a silver bullet for all problems related to gathering, processing and analysing massive datasets. Due to its rapid evolution (do not forget that Spark is one of the most active open source projects), some of the ideas behind it remain unclear and require digging through various blog posts and presentations. During this talk we will dive into the internals of Spark SQL, look at how our queries are translated into the actual code executed on the nodes, and find different ways to debug and optimize them.

Bio:

Mikołaj Kromka is a software engineer and Spark trainer at VirtusLab, focused on finding new connections between Scala, Functional Programming and Big Data. In his spare time he likes to analyse complex networks, take photos and explore Cracow's museums.
