Large scale weak supervision & Scalable recommendations in a hybrid environment

Event:

Event type:

Meetup

Category:

IT

Topic:

Database, Big Data

Date:

02.03.2020 (Monday)

Time:

18:00

Language:

English

Price:

Free

City:

Kraków

Place:

Community Hub Kraków

Address:

Podwale 3

Speakers:

Suneel Marthi

Mikołaj Kromka

Agenda:

18:00 Networking
18:30 Large Scale Weak Supervision with Snorkel and Apache Beam by Suneel Marthi
19:30 Break
19:45 Scalable recommendations in a hybrid environment by Mikołaj Kromka

Description:

1. Large Scale Weak Supervision with Snorkel and Apache Beam

The advent of Deep Learning models has led to massive growth in real-world machine learning. These models rely on massive hand-labeled training datasets, which are a bottleneck in developing and modifying machine learning models.

Most large-scale Machine Learning systems today, such as Google’s DryBell, use some form of Weak Supervision to construct lower-quality, large-scale training datasets that can be used to continuously retrain and deploy models in a real-world scenario.

The challenge with continuous retraining is that one needs to maintain prior state (e.g., the labeling functions in the case of Weak Supervision, or a pre-trained model like BERT or Word2Vec for Transfer Learning) that is shared across multiple streams, while continuously updating the model. Apache Beam’s stateful stream processing capabilities are a good match here, including for scalable Weak Supervision.
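For readers new to the idea, the following is a minimal sketch of what labeling functions look like, written in plain Python. It does not use the Snorkel or Apache Beam APIs, and every name in it (the label constants, the three heuristics, `weak_label`) is hypothetical and not taken from the talk. It combines votes by simple majority as a stand-in; Snorkel itself combines labeling-function outputs with a learned label model.

```python
from collections import Counter

# Hypothetical label values for a toy spam/ham task (assumption for illustration).
ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text):
    """Vote SPAM when the text contains a link, otherwise abstain."""
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_short_message(text):
    """Vote HAM for very short messages, otherwise abstain."""
    return HAM if len(text.split()) < 5 else ABSTAIN


def lf_mentions_prize(text):
    """Vote SPAM when the text mentions a prize, otherwise abstain."""
    return SPAM if "prize" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_short_message, lf_mentions_prize]


def weak_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return ranked[0][0]
```

In a streaming setting, the list of labeling functions is the kind of shared state the abstract refers to: it has to stay available to every stream while new records are labeled.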

Attendees will come away with a better understanding of how Weak Supervision with Apache Beam’s stateful stream processing can be used to accelerate the labeling of training data and the real-time training and updating of machine learning models.

Bio:

Suneel is a Member of the Apache Software Foundation and a Committer and PMC member on Apache Mahout, Apache OpenNLP, and Apache Streams. He presently works as a Principal Technologist – AI/ML at Amazon Web Services. He has previously presented at Flink Forward, Hadoop Summit Europe, Berlin Buzzwords, Machine Learning Conference, and Apache Big Data. He is based in Dulles, Virginia, in the Washington DC metro area.

https://twitter.com/saarw/status/1223922504235999232/photo/1

2. Scalable recommendations in a hybrid environment

How do you develop projects using Machine Learning on Big Data? Is it possible to scale Python code, and if so, at what cost? How do you set up cooperation between engineers and data scientists? Answers to these and many other questions will be provided during the presentation, based on experience in creating an analytical platform for one of the biggest retailers in the world, Tesco. The main theme will be a personalised product recommendation system that assists millions of British customers in real time.

Bio:

Mikołaj Kromka is a Software Development Manager at VirtusLab, currently involved in projects using Machine Learning on Big Data, where he helps to parallelize and run code in a hybrid cluster-cloud environment. He is a Spark and PySpark workshop trainer. Privately, he is a CS PhD student at AGH in Kraków, a climber, and an explorer of Kraków's museums.

DataKRK
