This event has already taken place.

On the Adversarial Vulnerabilities of Large Language Models

Event:
On the Adversarial Vulnerabilities of Large Language Models
Event type:
Meetup
Category:
IT
Topic:
Date:
27.10.2022 (Thursday)
Time:
18:00
Language:
English
Admission:
Free
City:
Venue:
Online
Address:
Online
Speakers:
Description:

Abstract: Large-scale pre-trained language models have achieved tremendous success across a wide range of natural language understanding (NLU) tasks, even surpassing human performance. However, the robustness of these models can be challenged by carefully crafted textual adversarial examples. We first propose SemAttack, an efficient and effective framework that generates natural adversarial text by constructing different semantic perturbation functions. It optimizes the generated perturbations under constraints from generic semantic spaces, including the typo space, the knowledge space (e.g., WordNet), the contextualized semantic space (e.g., the embedding space of BERT clusterings), or a combination of these spaces. We further present Adversarial GLUE (AdvGLUE), a new multi-task benchmark for quantitatively and thoroughly evaluating the vulnerabilities of modern large-scale language models under various types of adversarial attacks. We hope our work will motivate the development of new adversarial attacks that are stealthier and better at preserving semantics, as well as new language models that are robust against sophisticated adversarial attacks.


Bio: Boxin Wang is a computer science PhD candidate at the University of Illinois at Urbana-Champaign (UIUC). He is a research assistant at the Secure Learning Lab led by Prof. Bo Li. He was awarded the NeurIPS 2022 Scholar Award and the Yunni & Maxine Pao Memorial Fellowship, and was selected as a Norton Labs Graduate Fellowship finalist. He has held multiple research internships at Google, Microsoft, and NVIDIA. His research interests center on trustworthy natural language processing (NLP), including exploring the vulnerabilities of existing state-of-the-art ML models and designing robust, private, and generalizable models for social good. Additional information is available at https://wbx.life.


The talk will be held in English.


