TY - JOUR
T1 - Focused Contrastive Loss for Classification With Pre-Trained Language Models
AU - He, Jiayuan
AU - Li, Yuan
AU - Zhai, Zenan
AU - Fang, Biaoyan
AU - Thorne, Camilo
AU - Druckenbrodt, Christian
AU - Akhondi, Saber
AU - Verspoor, Karin
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023/11/3
Y1 - 2023/11/3
N2 - Contrastive learning, which learns data representations by contrasting similar and dissimilar instances, has achieved great success in various domains including natural language processing (NLP). Recently, it has been demonstrated that incorporating class labels into contrastive learning, i.e., supervised contrastive learning (SCL), can further enhance the quality of the learned data representations. Although several works have shown empirically that incorporating SCL into classification models leads to better performance, the mechanism of how SCL works for classification is less studied. In this paper, we first investigate how SCL facilitates classifier learning, showing that the contrastive region, i.e., the data instances involved in each contrasting operation, is crucial to the mechanism of SCL. We reveal that vanilla SCL is suboptimal, since its behavior can be altered by variance in class distributions. Based on this finding, we propose a Focused Contrastive Loss (FoCL) for classification. Compared with SCL, FoCL defines a finer contrastive region, focusing on the data instances surrounding decision boundaries. We conduct extensive experiments on three NLP tasks: text classification, named entity recognition, and relation extraction. Experimental results show consistent and significant improvements of FoCL over strong baselines on various benchmark datasets, especially in few-shot scenarios.
KW - Data mining
KW - contrastive learning
KW - natural language processing
KW - text mining
UR - https://ieeexplore.ieee.org/abstract/document/10306323/authors#authors
U2 - 10.1109/TKDE.2023.3327777
DO - 10.1109/TKDE.2023.3327777
M3 - Article
SN - 1041-4347
VL - 36
SP - 3047
EP - 3061
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 7
ER -