On the Adversarial Vulnerabilities of Large Language Models

Event:

Event type:

Meetup

Category:

IT

Topic:

AI/ML

Date:

27.10.2022 (Thursday)

Time:

18:00

Language:

English

Price:

Free

City:

On-line

Place:

On-line

Address:

On-line


Speakers:

Boxin Wang

Description:

Abstract: Large-scale pre-trained language models have achieved tremendous success across a wide range of natural language understanding (NLU) tasks, even surpassing human performance. However, the robustness of these models can be challenged by carefully crafted textual adversarial examples. We first propose SemAttack, an efficient and effective framework that generates natural adversarial text by constructing different semantic perturbation functions and optimizing the generated perturbations under the constraints of generic semantic spaces: the typo space, the knowledge space (e.g., WordNet), the contextualized semantic space (e.g., the embedding space of BERT clusterings), or a combination of these spaces. We further present Adversarial GLUE (AdvGLUE), a new multi-task benchmark for quantitatively and thoroughly exploring and evaluating the vulnerabilities of modern large-scale language models under various types of adversarial attacks. We hope our work will motivate the development of new adversarial attacks that are stealthier and better at preserving semantics, as well as new language models that are robust against sophisticated adversarial attacks.
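As a rough illustration of what "perturbation functions constrained on semantic spaces" can mean, the sketch below enumerates candidate word substitutions from two toy spaces and from their combination. It is not the SemAttack implementation: the function names, the adjacent-character-swap rule for the typo space, and the tiny `TOY_SYNONYMS` dictionary (a hypothetical stand-in for a knowledge base such as WordNet) are all assumptions made for this example. The method described in the abstract additionally optimizes the perturbations against a target model, which is omitted here.

```python
from typing import Callable, Dict, List, Sequence, Set

# Hypothetical stand-in for a knowledge base such as WordNet.
TOY_SYNONYMS: Dict[str, Set[str]] = {
    "good": {"fine", "great"},
    "movie": {"film"},
}

PerturbationSpace = Callable[[str], Set[str]]


def typo_space(word: str) -> Set[str]:
    """Toy typo space: swap two adjacent inner characters of the word."""
    candidates: Set[str] = set()
    # Keep the first and last characters fixed so the typo stays readable.
    for i in range(1, len(word) - 2):
        swapped = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        if swapped != word:
            candidates.add(swapped)
    return candidates


def knowledge_space(word: str) -> Set[str]:
    """Toy knowledge space: look the word up in the synonym dictionary."""
    return set(TOY_SYNONYMS.get(word, set()))


def combined_space(word: str, spaces: Sequence[PerturbationSpace]) -> Set[str]:
    """Union of the candidates proposed by each individual space."""
    candidates: Set[str] = set()
    for space in spaces:
        candidates |= space(word)
    return candidates


def perturbation_candidates(
    sentence: str, spaces: Sequence[PerturbationSpace]
) -> Dict[int, List[str]]:
    """Map each word position to its sorted list of candidate replacements."""
    return {
        position: sorted(combined_space(word, spaces))
        for position, word in enumerate(sentence.split())
    }


if __name__ == "__main__":
    print(perturbation_candidates("good movie", [typo_space, knowledge_space]))
```

In a real attack, a search or optimization procedure would pick from these candidate sets the substitutions that change the model's prediction while keeping the text natural; the sketch stops at candidate generation.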

Bio: Boxin Wang is a computer science PhD candidate at the University of Illinois at Urbana-Champaign (UIUC). He is a research assistant at the Secure Learning Lab led by Prof. Bo Li. He has received the NeurIPS 2022 Scholar Award and the Yunni & Maxine Pao Memorial Fellowship, and has been selected as a Norton Labs Graduate Fellowship finalist. He has completed multiple research internships at Google, Microsoft, and NVIDIA. His research interests lie in trustworthy natural language processing (NLP), including exploring the vulnerabilities of existing state-of-the-art ML models and designing robust, private, and generalizable models for social good. Additional information is available at https://wbx.life.

The talk will be held in English.



Polish Natural Language Processing Meetup Group
