TY - JOUR
T1 - Collaborative filtering with information-rich and information-sparse entities
AU - Zhu, Kai
AU - Wu, Rui
AU - Ying, Lei
AU - Srikant, R.
N1 - Funding Information:
Acknowledgments: Research supported in part by AFOSR grant FA 9550-10-1-0573 and NSF grants ECCS-1202065 and ECCS-1255425. We thank Prof. Olgica Milenkovic for comments on an earlier version of the paper.
PY - 2014/10
Y1 - 2014/10
N2 - In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with ω(MK log M) noisy entries while MK entries are necessary, where K is the number of clusters and M is the number of items. In the case of co-clustering, we prove that K² entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when K is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and the popularity-among-friends algorithms. The performance difference increases even more when noise is added to the datasets.
AB - In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with ω(MK log M) noisy entries while MK entries are necessary, where K is the number of clusters and M is the number of items. In the case of co-clustering, we prove that K² entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when K is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and the popularity-among-friends algorithms. The performance difference increases even more when noise is added to the datasets.
KW - Clustering model
KW - Collaborative filtering
KW - Matrix completion
KW - Recommender system
UR - http://www.scopus.com/inward/record.url?scp=84906946511&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84906946511&partnerID=8YFLogxK
U2 - 10.1007/s10994-014-5454-z
DO - 10.1007/s10994-014-5454-z
M3 - Article
AN - SCOPUS:84906946511
SN - 0885-6125
VL - 97
SP - 177
EP - 203
JO - Machine Learning
JF - Machine Learning
IS - 1-2
ER -