On the Adversarial Vulnerabilities of Large Language Models

Event:

Event type:

Meetup

Category:

IT

Topic:

AI/ML

Date:

27.10.2022 (Thursday)

Time:

18:00

Language:

English

Admission:

Free

City:

Online

Venue:

Online

Address:

Online

Speakers:

Boxin Wang

Description:

Abstract: Large-scale pre-trained language models have achieved tremendous success across a wide range of natural language understanding (NLU) tasks, even surpassing human performance. However, carefully crafted textual adversarial examples can undermine the robustness of these models. We first propose SemAttack, an efficient and effective framework that generates natural adversarial text by constructing different semantic perturbation functions. It optimizes the generated perturbations under the constraint of generic semantic spaces, including the typo space, the knowledge space (e.g., WordNet), the contextualized semantic space (e.g., the embedding space of BERT clusterings), or a combination of these spaces. We then present Adversarial GLUE (AdvGLUE), a new multi-task benchmark for quantitatively and thoroughly evaluating the vulnerabilities of modern large-scale language models under various types of adversarial attacks. We hope our work will motivate the development of new adversarial attacks that are stealthier and better at preserving semantics, as well as new language models that are robust against sophisticated adversarial attacks.

Bio: Boxin Wang is a computer science PhD candidate at the University of Illinois at Urbana-Champaign (UIUC) and a research assistant at the Secure Learning Lab led by Prof. Bo Li. He has received the NeurIPS 2022 Scholar Award and the Yunni & Maxine Pao Memorial Fellowship, and was selected as a Norton Labs Graduate Fellowship finalist. He has held multiple research internships at Google, Microsoft, and NVIDIA. His research focuses on trustworthy natural language processing (NLP), including exploring the vulnerabilities of existing state-of-the-art ML models and designing robust, private, and generalizable models for social good. Additional information is available at https://wbx.life.

The talk will be held in English.

Polish Natural Language Processing Meetup Group
