PyData Warsaw #27
18:00 - Paweł Ekk-Cierniakowski, SoftwareOne - "Machine learning based video processing"
About topic: During my lecture, I will walk through the 2-year journey of our project for a media house, which began with the transcription of videos and the detection of background sounds. This initiative aims to enhance content accessibility for individuals with disabilities. My presentation will cover the entire process, from the initial Proof-of-Concept stage, through the development of a Minimum Viable Product (MVP), to deployment in a production environment.
I will discuss various strategies to improve the quality of transcriptions and increase the accuracy of background sound detection. Additionally, I will highlight the integration of advanced features based on Azure services. These features include the creation of concise video summaries, automatic translations, and dubbing. My goal is to showcase how these enhancements contribute to making content more accessible and user-friendly.
About speaker: For over 10 years he has been professionally involved in data analytics as a data scientist, team leader and project manager. He has participated in many projects in the area of advanced data analytics, such as monitoring production lines, opinion analysis, fraud detection and price forecasting. His main experience and interests are in the pharmaceutical and healthcare industries, but he has also been involved in projects in areas such as finance, media, energy and agriculture. Currently, he is responsible for designing and implementing data solutions, mainly in the field of machine learning and artificial intelligence. He shares his knowledge as a data science trainer, lecturer and speaker at conferences. He is a co-author of scientific papers, mainly in the fields of medicine and statistics, published in, among others, journals from the Master Journal List.
18:45 - Misha Zanka, Parsera - "How LLMs Can Transform the Landscape of Web Scraping?"
About topic: Web data extraction is often more painful than it seems: CSS selectors, shifting page structures, and endless debugging make even simple tasks frustrating. But what if you never had to touch code again? LLMs, with their ability to retrieve text and generate code, are redefining how we extract data from the web. They can handle complex extraction tasks across a wide variety of websites, while web scraping provides the perfect closed-loop problem for leveraging LLMs' code-generation capabilities.
About speaker: Misha Zanka is a startup founder with a strong technical background in ML and AI. He began his career in ML research at places like the University of Warsaw, Cisco and Allegro before transitioning to startups, where he now focuses on automating web scraping with LLMs.
After-party at Pizza przy Politechnice :)
Venue:
Centrum Innowacji Politechniki Warszawskiej, ul. Rektorska 4
Room 3.12 (3rd Floor)