Machine Reading, Fast and Slow: When Do Models “Understand” Language?

Event:

Machine Reading, Fast and Slow: When Do Models...

Event type:

Meetup

Category:

IT

Topic:

programming

Date:

15.12.2022 (Thursday)

Time:

19:00

Language:

English

Admission:

Free

City:

Online

Venue:

Online

Address:

Online

Speakers:

Sagnik Ray Choudhury

Description:

Abstract:

Two of the most fundamental challenges in Natural Language Understanding (NLU) at present are: (a) how to establish whether deep learning-based models score highly on NLU benchmarks for the 'right' reasons; and (b) to understand what those reasons would even be. We investigate the behavior of reading comprehension models with respect to two linguistic 'skills': coreference resolution and comparison. We propose a definition for the reasoning steps expected from a system that would be 'reading slowly', and compare that with the behavior of five models of the BERT family of various sizes, observed through saliency scores and counterfactual explanations. We find that for comparison (but not coreference) the systems based on larger encoders are more likely to rely on the 'right' information, but even they struggle with generalization, suggesting that they still learn specific lexical patterns rather than the general principles of comparison.

Bio:

Sagnik Ray Choudhury is a research fellow at the University of Michigan working on explainable information extraction. Previously, as a postdoctoral researcher at the University of Copenhagen, he worked on the explainability of DNN models used in multi-hop reasoning systems, such as question-answering, fact-checking, and natural language inference. During his Ph.D. at Penn State, he worked on information extraction from scholarly figures and tables, information retrieval, and crawling.

Sagnik also worked in industry as an NLP/ML engineer at Interactions LLC, a leading AI-based customer service automation company. He developed DNN models for large-scale entity extraction and linking, dialog systems, and sentiment classification, and contributed to open-source DNN libraries.

Participants (1):

Alicja Gruzdz

Polish Natural Language Processing Meetup Group
