Chemical reaction reference resolution in patents

Hiyori Yoshikawa, Saber Akhondi, Camilo Thorne, Christian Druckenbrodt, Ralph Hoessel, Zenan Zhai, Jiayuan He, Timothy Baldwin, Karin Verspoor

Research output: Contribution to journal › Conference article › peer-review

Abstract

Many new chemical compounds are reported each year in patent documents, leading to increasing demand for methods for automatic information extraction of chemical compounds and reactions from patents. Chemical patents often detail a number of similar compounds that have a common substructure and can be synthesized in analogous ways, and therefore contain many references connecting descriptions of similar chemical reactions, to avoid redundancy in describing common reaction conditions. This leads to the problem of reaction reference resolution, where, given a reaction description, we need to identify links to other reaction descriptions it refers to. In this paper, we formally introduce the task and propose baseline methods to address it in analogy with co-reference resolution. To evaluate the performance, we create a large-scale silver-standard dataset based on a commercial database of chemical reactions. The experimental results show that the approach based on a state-of-the-art co-reference resolution method struggles to outperform a simple heuristic in detecting reference links, demonstrating the difficulty of the proposed task and its fundamentally different nature to co-reference resolution.

Original language: English
Pages (from-to): 10-17
Number of pages: 8
Journal: CEUR Workshop Proceedings
Volume: 2909
State: Published - 2021
Externally published: Yes
Event: 2nd Workshop on Patent Text Mining and Semantic Technologies, PatentSemTech 2021 - Virtual, Online
Duration: Jul 15 2021 → …

Keywords

  • Information extraction
  • Natural language processing
  • Reaction reference resolution
