Characterizing and predicting downloads in academic search

Xinyi Li, Maarten de Rijke

Research output: Contribution to journal, Article, peer-review

4 Scopus citations


Numerous studies have been conducted on the information interaction behavior of search engine users. Few studies have considered information interactions in the domain of academic search. We focus on conversion behavior in this domain. Conversions have been widely studied in the e-commerce domain, e.g., for online shopping and hotel booking, but little is known about conversions in academic search. We start with a description of a unique dataset of a particular type of conversion in academic search, viz. users’ downloads of scientific papers. Then we move to an observational analysis of users’ download actions. We first characterize user actions and show their statistics in sessions. Then we focus on behavioral and topical aspects of downloads, revealing behavioral correlations across download sessions. We discover unique properties that differ from other conversion settings such as online shopping. Using insights gained from these observations, we consider the task of predicting the next download. In particular, we focus on predicting the time until the next download session, and on predicting the number of downloads. We cast these as time series prediction problems and model them using LSTMs. We develop a specialized model built on user segmentations that achieves significant improvements over the state of the art.

Original language: American English
Pages (from-to): 394-407
Number of pages: 14
Journal: Information Processing and Management
Issue number: 3
State: Published - May 2019


  • Academic search
  • Download behavior
  • Download prediction
  • User segmentation

