TY - GEN
T1 - RecTree
T2 - 3rd International Conference on Data Warehousing and Knowledge Discovery, DaWaK 2001
AU - Chee, Sonny Han Seng
AU - Han, Jiawei
AU - Wang, Ke
N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 2001.
PY - 2001
Y1 - 2001
N2 - Many people rely on the recommendations of trusted friends to find restaurants or movies, which match their tastes. But, what if your friends have not sampled the item of interest? Collaborative filtering (CF) seeks to increase the effectiveness of this process by automating the derivation of a recommendation, often from a clique of advisors that we have no prior personal relationship with. CF is a promising tool for dealing with the information overload that we face in the networked world. Prior works in CF have dealt with improving the accuracy of the predictions. However, it is still challenging to scale these methods to large databases. In this study, we develop an efficient collaborative filtering method, called RecTree (which stands for RECommendation Tree) that addresses the scalability problem with a divide-and-conquer approach. The method first performs an efficient k-means-like clustering to group data and creates neighborhood of similar users, and then performs subsequent clustering based on smaller, partitioned databases. Since the progressive partitioning reduces the search space dramatically, the search for an advisory clique will be faster than scanning the entire database of users. In addition, the partitions contain users that are more similar to each other than those in other partitions. This characteristic allows RecTree to avoid the dilution of opinions from good advisors by a multitude of poor advisors and thus yielding a higher overall accuracy. Based on our experiments and performance study, RecTree outperforms the well-known collaborative filter, CorrCF, in both execution time and accuracy. In particular, RecTree’s execution time scales by O(nlog2(n)) with the dataset size while CorrCF scales quadratically.
AB - Many people rely on the recommendations of trusted friends to find restaurants or movies, which match their tastes. But, what if your friends have not sampled the item of interest? Collaborative filtering (CF) seeks to increase the effectiveness of this process by automating the derivation of a recommendation, often from a clique of advisors that we have no prior personal relationship with. CF is a promising tool for dealing with the information overload that we face in the networked world. Prior works in CF have dealt with improving the accuracy of the predictions. However, it is still challenging to scale these methods to large databases. In this study, we develop an efficient collaborative filtering method, called RecTree (which stands for RECommendation Tree) that addresses the scalability problem with a divide-and-conquer approach. The method first performs an efficient k-means-like clustering to group data and creates neighborhood of similar users, and then performs subsequent clustering based on smaller, partitioned databases. Since the progressive partitioning reduces the search space dramatically, the search for an advisory clique will be faster than scanning the entire database of users. In addition, the partitions contain users that are more similar to each other than those in other partitions. This characteristic allows RecTree to avoid the dilution of opinions from good advisors by a multitude of poor advisors and thus yielding a higher overall accuracy. Based on our experiments and performance study, RecTree outperforms the well-known collaborative filter, CorrCF, in both execution time and accuracy. In particular, RecTree’s execution time scales by O(nlog2(n)) with the dataset size while CorrCF scales quadratically.
UR - http://www.scopus.com/inward/record.url?scp=84958760418&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84958760418&partnerID=8YFLogxK
U2 - 10.1007/3-540-44801-2_15
DO - 10.1007/3-540-44801-2_15
M3 - Conference contribution
AN - SCOPUS:84958760418
SN - 3540425535
SN - 9783540425533
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 141
EP - 151
BT - Data Warehousing and Knowledge Discovery - 3rd International Conference, DaWaK 2001, Proceedings
A2 - Winiwarter, Werner
A2 - Kambayashi, Yahiko
A2 - Arikawa, Masatoshi
PB - Springer
Y2 - 5 September 2001 through 7 September 2001
ER -