Contrastive learning, which learns data representations by contrasting similar and dissimilar instances, has achieved great success in various domains including natural language processing (NLP). Recently, it has been demonstrated that incorporating class labels into contrastive learning, i.e., supervised contrastive learning (SCL), can further enhance the quality of the learned data representations. Although several works have shown empirically that incorporating SCL into classification models leads to better performance, the mechanism of how SCL works for classification is less studied. In this paper, we first investigate how SCL facilitates the classifier learning, where we show that the contrastive region, i.e., the data instances involved in each contrasting operation, has a crucial link to the mechanism of SCL. We reveal that the vanilla SCL is suboptimal since its behavior can be altered by variances in class distributions. Based on this finding, we propose a Focused Contrastive Loss (FoCL) for classification. Compared with SCL, FoCL defines a finer contrastive region, focusing on the data instances surrounding decision boundaries. We conduct extensive experiments on three NLP tasks: text classification, named entity recognition, and relation extraction. Experimental results show consistent and significant improvements of FoCL over strong baselines on various benchmark datasets, especially in few-shot scenarios.
|IEEE Transactions on Knowledge and Data Engineering
|Published - Nov 3 2023