TY - JOUR
T1 - Exploring optimization of semantic relationship graph for multi-relational Bayesian classification
AU - Chen, Hailiang
AU - Liu, Hongyan
AU - Han, Jiawei
AU - Yin, Xiaoxin
AU - He, Jun
N1 - Funding Information:
This paper was supported in part by the National Natural Science Foundation of China under Grant Nos. 70871068, 70621061, and 70890083, and by the Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Ministry of Education, under Grant No. 2008001.
PY - 2009/1
Y1 - 2009/1
AB - In recent years, there has been growing interest in multi-relational classification research and applications, which must deal with a large relation search space, complex relationships between relations, and a daunting number of attributes. The Bayesian classifier is a simple but effective probabilistic classifier that has been shown to achieve good results in most real-world applications. Existing work on the multi-relational Naïve Bayes classifier mainly focuses on how to extend the traditional flat Naïve Bayes classification method to a multi-relational environment. In this paper, we look into how to increase the accuracy of the multi-relational Bayesian classifier while retaining its efficiency. We develop a Semantic Relationship Graph (SRG) to describe the relationships between multiple tables and to guide the search within the relation space. We then optimize the Semantic Relationship Graph by avoiding undesirable joins between relations and eliminating unnecessary attributes and relations. An experimental study on real-world and synthetic databases shows that the proposed optimization strategies improve the accuracy of the multi-relational Naïve Bayesian classifier at the cost of a small amount of running time.
KW - Depth-first
KW - Feature selection
KW - Multi-relational classification
KW - Naïve Bayesian classification
KW - Semantic relationship graph
KW - Width-first
UR - http://www.scopus.com/inward/record.url?scp=70350574561&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70350574561&partnerID=8YFLogxK
U2 - 10.1016/j.dss.2009.07.004
DO - 10.1016/j.dss.2009.07.004
M3 - Article
AN - SCOPUS:70350574561
SN - 0167-9236
VL - 48
SP - 112
EP - 121
JO - Decision Support Systems
JF - Decision Support Systems
IS - 1
ER -