Explainable User Clustering in Short Text Streams.

Yukun Zhao, Shangsong Liang, Zhaochun Ren, Jun Ma, Emine Yilmaz, Maarten de Rijke

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

26 Scopus citations

Abstract

User clustering has been studied from different angles: behavior-based, to identify similar browsing or search patterns, and content-based, to identify shared interests. Once user clusters have been found, they can be used for recommendation and personalization. So far, content-based user clustering has mostly focused on static sets of relatively long documents. Given the dynamic nature of social media, there is a need to dynamically cluster users in the context of short text streams. User clustering in this setting is more challenging than in the case of long documents as it is difficult to capture the users' dynamic topic distributions in sparse data settings. To address this problem, we propose a dynamic user clustering topic model (or UCT for short). UCT adaptively tracks changes of each user's time-varying topic distribution based both on the short texts the user posts during a given time period and on the previously estimated distribution. To infer changes, we propose a Gibbs sampling algorithm where a set of word-pairs from each user is constructed for sampling. The clustering results are explainable and human-understandable, in contrast to many other clustering algorithms. For evaluation purposes, we work with a dataset consisting of users and tweets from each user. Experimental results demonstrate the effectiveness of our proposed clustering model compared to state-of-the-art baselines.
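The word-pair construction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the pairs are formed, so this assumes that a pair is any two distinct words co-occurring in the same short text, and that pairs are unordered and counted per user within one time period. The function name `user_word_pairs` and the toy tokens are hypothetical and are not taken from the paper. The Gibbs sampler itself is not shown.

```python
from collections import Counter
from itertools import combinations


def user_word_pairs(posts):
    """Count unordered word pairs that co-occur within each short text.

    `posts` is a list of token lists, one per short text posted by a single
    user during one time period. Pairing within a text (rather than across
    texts) is an assumption made for this sketch.
    """
    pairs = Counter()
    for tokens in posts:
        # Deduplicate and sort so that (a, b) and (b, a) share one key.
        vocab = sorted(set(tokens))
        pairs.update(combinations(vocab, 2))
    return pairs


# Toy example: two short texts from one user in one time period.
posts = [["topic", "model", "stream"], ["stream", "model"]]
pairs = user_word_pairs(posts)
```

Working with pairs rather than single words is a common way to get denser co-occurrence statistics out of very short texts. Whether UCT builds its pairs exactly this way should be checked against the paper.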
Original language: American English
Title of host publication: SIGIR '16 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
DOIs: https://doi.org/10.1145/2911451.2911522
State: Published - 2016

  • Cite this

    Zhao, Y., Liang, S., Ren, Z., Ma, J., Yilmaz, E., & de Rijke, M. (2016). Explainable User Clustering in Short Text Streams. In SIGIR '16 Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval. https://doi.org/10.1145/2911451.2911522