TY - GEN
T1 - Mining knowledge from databases
T2 - 2010 International Conference on Management of Data, SIGMOD '10
AU - Han, Jiawei
AU - Sun, Yizhou
AU - Yan, Xifeng
AU - Allen, Gabrielle Dawn
PY - 2010
Y1 - 2010
N2 - Most people consider a database is merely a data repository that supports data storage and retrieval. Actually, a database contains rich, inter-related, multi-typed data and information, forming one or a set of gigantic, interconnected, heterogeneous information networks. Much knowledge can be derived from such information networks if we systematically develop an effective and scalable database-oriented information network analysis technology. In this tutorial, we introduce database-oriented information network analysis methods and demonstrate how information networks can be used to improve data quality and consistency, facilitate data integration, and generate interesting knowledge. This tutorial presents an organized picture on how to turn a database into one or a set of organized heterogeneous information networks, how information networks can be used for data cleaning, data consolidation, and data qualify improvement, how to discover various kinds of knowledge from information networks, how to perform OLAP in information networks, and how to transform database data into knowledge by information network analysis. Moreover, we present interesting case studies on real datasets, including DBLP and Flickr, and show how interesting and organized knowledge can be generated from database-oriented information networks.
AB - Most people consider a database is merely a data repository that supports data storage and retrieval. Actually, a database contains rich, inter-related, multi-typed data and information, forming one or a set of gigantic, interconnected, heterogeneous information networks. Much knowledge can be derived from such information networks if we systematically develop an effective and scalable database-oriented information network analysis technology. In this tutorial, we introduce database-oriented information network analysis methods and demonstrate how information networks can be used to improve data quality and consistency, facilitate data integration, and generate interesting knowledge. This tutorial presents an organized picture on how to turn a database into one or a set of organized heterogeneous information networks, how information networks can be used for data cleaning, data consolidation, and data qualify improvement, how to discover various kinds of knowledge from information networks, how to perform OLAP in information networks, and how to transform database data into knowledge by information network analysis. Moreover, we present interesting case studies on real datasets, including DBLP and Flickr, and show how interesting and organized knowledge can be generated from database-oriented information networks.
KW - data mining
KW - information networks
KW - link analysis
UR - http://www.scopus.com/inward/record.url?scp=77954723281&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954723281&partnerID=8YFLogxK
U2 - 10.1145/1807167.1807333
DO - 10.1145/1807167.1807333
M3 - Conference contribution
AN - SCOPUS:77954723281
SN - 9781450300322
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1251
EP - 1252
BT - Proceedings of the 2010 International Conference on Management of Data, SIGMOD '10
Y2 - 6 June 2010 through 11 June 2010
ER -