Stress Test Evaluation of Biomedical Word Embeddings

Vladimir Araujo, Andrés Carvallo, Carlos Aspillaga, Camilo Thorne, Denis Parra

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

The success of pretrained word embeddings has motivated their use in the biomedical domain, with contextualized embeddings yielding remarkable results in several biomedical NLP tasks. However, there is a lack of research on quantifying their behavior under severe “stress” scenarios. In this work, we systematically evaluate three language models with adversarial examples – automatically constructed tests that allow us to examine how robust the models are. We propose two types of stress scenarios focused on the biomedical named entity recognition (NER) task, one inspired by spelling errors and another based on the use of synonyms for medical terms. Our experiments with three benchmarks show that the performance of the original models decreases considerably, in addition to revealing their weaknesses and strengths. Finally, we show that adversarial training causes the models to improve their robustness and even to exceed the original performance in some cases.
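
The spelling-error scenario described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`swap_adjacent_chars`, `perturb_entities`), the choice of an adjacent-character swap as the error type, the restriction to entity tokens, and the tag label `B-Chemical` are all assumptions made for illustration.

```python
import random


def swap_adjacent_chars(word, rng):
    """Swap two adjacent inner characters to mimic a spelling error (hypothetical helper)."""
    if len(word) < 4:
        return word  # too short to perturb without touching the first or last character
    i = rng.randrange(1, len(word) - 2)  # inner positions only
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def perturb_entities(tokens, tags, rng):
    """Perturb only tokens tagged as part of an entity (any tag other than 'O')."""
    return [
        swap_adjacent_chars(token, rng) if tag != "O" else token
        for token, tag in zip(tokens, tags)
    ]


rng = random.Random(0)  # fixed seed so the perturbed test set is reproducible
tokens = ["Patients", "received", "ibuprofen", "daily"]
tags = ["O", "O", "B-Chemical", "O"]  # assumed BIO-style NER tags
noisy = perturb_entities(tokens, tags, rng)
print(noisy)
```

Because the tags are left untouched, the perturbed sentences can be scored against the original gold labels, which is what makes such automatically constructed tests usable as a robustness check.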

Original language: English
Title of host publication: Proceedings of the 20th Workshop on Biomedical Language Processing, BioNLP 2021
Editors: Dina Demner-Fushman, Kevin Bretonnel Cohen, Sophia Ananiadou, Junichi Tsujii
Publisher: Association for Computational Linguistics (ACL)
Pages: 119-125
Number of pages: 7
ISBN (Electronic): 9781954085404
State: Published - 2021
Event: 20th Workshop on Biomedical Language Processing, BioNLP 2021 - Virtual, Online
Duration: Jun 11 2021 → …

Publication series

Name: Proceedings of the 20th Workshop on Biomedical Language Processing, BioNLP 2021

Conference

Conference: 20th Workshop on Biomedical Language Processing, BioNLP 2021
City: Virtual, Online
Period: 06/11/21 → …
