Streaming topic model with Apache Flink + Using IoT Analytics To Save The Planet
- 6:00 PM - Networking
- 6:30 PM - DataKRK update and intro by Grand Parade
- 6:45 PM - Streaming topic model training and inference with Apache Flink
- 7:30 PM - Short break
- 7:45 PM - Using IoT Analytics To Save The Planet
- 8:30 PM - Q&A, Quiz, Networking
# First talk: Streaming topic model training and inference with Apache Flink by Suneel Marthi and Jörn Kottmann
Analyzing streams of text data to extract topics is an important task for gaining insights that subsequent workflows can build on. This talk shows how to use stateful stream processing and Flink’s dynamic processing capabilities to continuously train topic models from unlabelled text, and how to use those models to extract topics from the data itself.
We discuss a new approach to streaming topic modeling and also look at other implementations, such as Online LDA, that build on Apache Flink's stateful streaming. The topic models are built on distributed representations of words and documents.
Suneel is a Member of the Apache Software Foundation and a committer and PMC member on Apache Mahout, Apache OpenNLP and Apache Streams. He currently works as a Principal Technologist – AI/ML at Amazon Web Services. He has previously presented at Flink Forward, Hadoop Summit Europe, Berlin Buzzwords, Machine Learning Conference and Apache Big Data. He is based in Dulles, Virginia, in the Washington DC metro area.
Jörn is a member of the Apache Software Foundation. He has contributed to Apache OpenNLP for 13 years and is the project's PMC Chair and a committer. In his day jobs he has used OpenNLP to process large document collections and streams, often in combination with Apache UIMA, where he is also a PMC member and committer.
# Second talk: Using IoT Analytics To Save The Planet :)
It's easy and cheap to find patterns in data using modern analytics tools and techniques. The bigger question is whether analytics can be used to change people's "offline" behaviour, hopefully for the better. We observed that drivers with the teliaSense product installed in their car who looked at the "Eco-Drive" feature in the corresponding mobile app more often also used their car in a more eco-friendly manner, idling for a shorter duration per km driven. Subsequently, my team ran an experiment in which we made the Eco-Drive feature more prominent to users, and this "caused" a subset of them to start idling less per km driven, compared both to the wider population of drivers and to their own previous idle time. This interdisciplinary project merges behavioral psychology and economics, statistics, and IoT analytics. The results carry implications for tackling a wide array of challenges for us as a business, such as accident prevention and lowering carbon emissions from transport. Furthermore, the wider learnings from such an experiment can be abstracted and applied to a number of domains, such as healthcare, finance and public services.
Aru leads the analytics team at Tantalum, a connected cars platform based in Stockholm. He holds undergrad degrees in Computer Science and Political Science from Grinnell College (USA), and went to graduate school in Economics at The Graduate Institute of International Studies (Switzerland). Prior to Tantalum, he worked in the statistics/analytics teams at the United Nations, Yahoo and Truecaller. Outside work, he enjoys reading about psychology/tech/politics, watching cricket, cooking, playing tennis and running.
Title: Streaming topic model with Apache Flink + Using IoT Analytics To Save The Planet
When: Fri, Mar 1st 2019, 6PM-9PM
Where: Grand Parade (part of William Hill) office, Kotlarska 11 (the main entrance is from Kotlarska street, alongside the main route)