Abstract
User clustering has been studied from different angles. To identify shared interests, behavior-based methods consider similar browsing or search patterns of users, whereas content-based methods use information from the contents of the documents visited by the users. So far, content-based user clustering has mostly focused on static sets of relatively long documents. Given the dynamic nature of social media, there is a need to dynamically cluster users in the context of streams of short texts. User clustering in this setting is more challenging than in the case of long documents, as it is difficult to capture the users' dynamic topic distributions in sparse data settings. To address this problem, we propose a dynamic user clustering topic model (UCT). UCT adaptively tracks changes in each user's time-varying topic distributions based both on the short texts the user posts during a given time period and on previously estimated distributions. To infer changes, we propose a Gibbs sampling algorithm where a set of word pairs from each user is constructed for sampling. UCT can be used in two ways: (1) as a short-term dependency model that infers a user's current topic distributions based on the user's topic distributions during the previous time period only, and (2) as a long-term dependency model that infers a user's current topic distributions based on the user's topic distributions during multiple time periods in the past. The clustering results are explainable and human-understandable, in contrast to many other clustering algorithms. For evaluation purposes, we work with a dataset consisting of users and tweets from each user. Experimental results demonstrate the effectiveness of our proposed short-term and long-term dependency user clustering models compared to state-of-the-art baselines.
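The abstract mentions that a set of word pairs is constructed from each user's posts for sampling. As a rough illustration only, the sketch below builds such pairs under the assumption that a pair is any two distinct words co-occurring in the same short text. The function name `user_word_pairs`, the whitespace tokenization, and the example posts are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def user_word_pairs(posts):
    """Count unordered word pairs co-occurring within each short text.

    Assumption: a "word pair" is any two distinct words from the same post.
    The paper's actual construction may differ.
    """
    pairs = Counter()
    for text in posts:
        # Naive whitespace tokenization; sorting gives each pair one canonical order.
        tokens = sorted(set(text.lower().split()))
        pairs.update(combinations(tokens, 2))
    return pairs


# Hypothetical posts from a single user.
posts = ["new phone battery", "phone battery life"]
pairs = user_word_pairs(posts)
```

Pooling pairs over all of a user's posts is one common way to offset the sparsity of individual short texts, since each post on its own contains very few co-occurrences.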
| Original language | American English |
|---|---|
| Article number | 10 |
| Pages (from-to) | 1-37 |
| Journal | ACM Transactions on Information Systems |
| Volume | 36 |
| Issue number | 1 |
| DOIs | |
| State | Published - 2017 |
Keywords
- Ad hoc retrieval
- Data streams
- Diversity
Fingerprint
Dive into the research topics of 'Inferring Dynamic User Interests in Streams of Short Texts for User Clustering'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Dynamic User Interests
  Liang, S. (CoI), Ren, Z. (CoI), Zhao, Y. (CoI), Yilmaz, E. (CoI), Kanoulas, E. (CoI), Ma, J. (CoI), De Rijke, M. (CoI) & Hobby, M. (CoI)
  08/1/15 → 07/1/19
  Project: Research