TY - GEN
T1 - A probabilistic mixture model for mining and analyzing product search log
AU - Duan, Huizhong
AU - Zhai, Chengxiang
AU - Cheng, Jinxing
AU - Gattani, Abhishek
PY - 2013
Y1 - 2013
N2 - The booming of e-commerce in recent years has led to the generation of large amounts of product search log data. Product search log is a unique new data with much valuable information and knowledge about user preferences over product attributes that is often hard to obtain from other sources. While regular search logs (e.g., Web search logs) contain click-throughs for unstructured text documents (e.g., web pages), product search logs contain click-throughs for structured entities defined by a set of attributes and their values. For instance, a laptop can be defined by its size, color, cpu, ram, etc. Such structures in product entities offer us opportunities to mine and discover detailed useful knowledge about user preferences at the attribute level, but they also raise significant challenges for mining due to the lack of attribute-level observations. In this paper, we propose a novel probabilistic mixture model for attribute-level analysis of product search logs. The model is based on a generative process where queries are generated by a mixture of unigram language models defined by each attribute-value pair of a clicked entity. The model can be efficiently estimated using the Expectation-Maximization (EM) algorithm. The estimated parameters, including the attribute-value language models and attribute-value preference models, can be directly used to improve product search accuracy, or aggregated to reveal knowledge for understanding user intent and supporting business intelligence. Evaluation of the proposed model on a commercial product search log shows that the model is effective for mining and analyzing product search logs to discover various kinds of useful knowledge.
AB - The booming of e-commerce in recent years has led to the generation of large amounts of product search log data. Product search log is a unique new data with much valuable information and knowledge about user preferences over product attributes that is often hard to obtain from other sources. While regular search logs (e.g., Web search logs) contain click-throughs for unstructured text documents (e.g., web pages), product search logs contain click-throughs for structured entities defined by a set of attributes and their values. For instance, a laptop can be defined by its size, color, cpu, ram, etc. Such structures in product entities offer us opportunities to mine and discover detailed useful knowledge about user preferences at the attribute level, but they also raise significant challenges for mining due to the lack of attribute-level observations. In this paper, we propose a novel probabilistic mixture model for attribute-level analysis of product search logs. The model is based on a generative process where queries are generated by a mixture of unigram language models defined by each attribute-value pair of a clicked entity. The model can be efficiently estimated using the Expectation-Maximization (EM) algorithm. The estimated parameters, including the attribute-value language models and attribute-value preference models, can be directly used to improve product search accuracy, or aggregated to reveal knowledge for understanding user intent and supporting business intelligence. Evaluation of the proposed model on a commercial product search log shows that the model is effective for mining and analyzing product search logs to discover various kinds of useful knowledge.
KW - Probabilistic mixture model
KW - Product search
KW - Search log mining
UR - http://www.scopus.com/inward/record.url?scp=84889599023&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84889599023&partnerID=8YFLogxK
U2 - 10.1145/2505515.2505578
DO - 10.1145/2505515.2505578
M3 - Conference contribution
AN - SCOPUS:84889599023
SN - 9781450322638
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2179
EP - 2188
BT - CIKM 2013 - Proceedings of the 22nd ACM International Conference on Information and Knowledge Management
T2 - 22nd ACM International Conference on Information and Knowledge Management, CIKM 2013
Y2 - 27 October 2013 through 1 November 2013
ER -