Is your document novel? Let attention guide you. An attention-based model for document-level novelty detection

Tirthankar Ghosal, Vignesh Edithal, Asif Ekbal, Pushpak Bhattacharyya, Srinivasa Satya Sameer Kumar Chivukula, George Tsatsaronis

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Detecting whether a document contains sufficient new information to be deemed novel is of immense significance in this age of data duplication. Existing techniques for document-level novelty detection mostly operate at the lexical level and are unable to address semantic-level redundancy. These techniques usually rely on handcrafted features extracted from the documents in a rule-based or traditional feature-based machine learning setup. Here, we present an effective approach based on a neural attention mechanism to detect document-level novelty without any manual feature engineering. We contend that the simple alignment of texts between the source and target document(s) could identify the state of novelty of a target document. Our deep neural architecture elicits inference knowledge from a large-scale natural language inference dataset, which proves crucial to the novelty detection task. Our approach is effective and outperforms the standard baselines and recent work on document-level novelty detection by a margin of 3% in terms of accuracy.
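
The alignment idea in the abstract can be illustrated with a minimal sketch of the soft-alignment ("attend") step used in decomposable attention, which appears among the keywords. This is not the authors' architecture: the function names (`softmax`, `dot`, `soft_align`) and the toy two-dimensional embeddings are hypothetical, and the learned feed-forward projection that decomposable attention normally applies before scoring is omitted here for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def soft_align(target_vecs, source_vecs):
    """For each target vector, return the attention-weighted
    average of the source vectors (its soft alignment).

    Assumption: raw dot products serve as attention scores; a real
    model would first pass the vectors through a learned network.
    """
    dim = len(source_vecs[0])
    aligned = []
    for t in target_vecs:
        weights = softmax([dot(t, s) for s in source_vecs])
        aligned.append(
            [sum(w * s[k] for w, s in zip(weights, source_vecs))
             for k in range(dim)]
        )
    return aligned


# Toy embeddings (hypothetical): two source tokens, two target tokens.
source = [[1.0, 0.0], [0.0, 1.0]]
target = [[5.0, 0.0], [0.0, 0.0]]
aligned = soft_align(target, source)
```

In this toy setup the first target token aligns almost entirely with the first source token, while the second target token matches nothing in particular and receives a uniform blend. A downstream classifier could compare each target vector with its aligned counterpart to judge how much of the target is already covered by the source.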

Original language: English
Pages (from-to): 427-454
Number of pages: 28
Journal: Natural Language Engineering
Volume: 27
Issue number: 4
State: Published - Jul 2021
Externally published: Yes

Keywords

  • Decomposable Attention
  • Document Classification
  • Document-Level Novelty Detection
  • Natural Language Inference
