InfoNetOLAP: OLAP and mining of information networks

Chen Chen, Feida Zhu, Xifeng Yan, Jiawei Han, Philip Yu, Raghu Ramakrishnan

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Databases and data warehouse systems have been evolving from handling normalized spreadsheets stored in relational databases to managing and analyzing diverse application-oriented data with complex interconnecting structures. Responding to this emerging trend, information networks have been growing rapidly and showing their critical importance in many applications, such as the analysis of XML, social networks, Web, biological data, multimedia data, and spatiotemporal data. Can we extend useful functions of databases and data warehouse systems to handle network-structured data? In particular, OLAP (On-Line Analytical Processing) has been a popular tool for fast and user-friendly multi-dimensional analysis of data warehouses. Can we OLAP information networks and perform mining tasks on top of that? Unfortunately, to the best of our knowledge, there are no OLAP tools available that can interactively view and analyze network-structured data from different perspectives and with multiple granularities. In this chapter, we argue that it is critically important to OLAP such information network data and propose a novel InfoNetOLAP framework. According to this framework, given an information network data set with its nodes and edges associated with respective attributes, a multi-dimensional model can be built to enable efficient on-line analytical processing so that any portion of the information network can be generalized/specialized dynamically, offering multiple, versatile views of the data set. The contributions of this work are threefold. First, starting from basic definitions, i.e., what dimensions and measures are in the InfoNetOLAP scenario, we develop a conceptual framework for data cubes constructed on information networks. We also look into different semantics of OLAP operations and classify the framework into two major subcases: informational OLAP and topological OLAP.
Second, we show how an information network cube can be materialized by calculating a special kind of measure called an aggregated graph and how to implement it efficiently. This includes both full materialization and partial materialization, where constraints are enforced to obtain an iceberg cube. Because of the increased structural complexity of the data, aggregated graphs that depend on the underlying graph properties of the information networks are much harder to compute than their traditional OLAP counterparts. Third, to provide more flexible, interesting, and insightful OLAP of information networks, we further propose a discovery-driven multi-dimensional analysis model to ensure that OLAP is performed in an intelligent manner, guided by expert rules and knowledge discovery processes. We outline such a framework and discuss some challenging research issues for discovery-driven InfoNetOLAP.
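
As a rough illustration of the generalization step described in the abstract, the following Python sketch rolls up a small attributed network by merging nodes that share a coarser attribute value and summing the weights of the edges between the merged groups. The node names, the institution labels, and the function `roll_up` are hypothetical and chosen only for illustration; the chapter's own definition of an aggregated graph and its materialization algorithms may differ.

```python
from collections import defaultdict


def roll_up(edges, node_group):
    """Merge nodes by group and sum edge weights between groups.

    edges: dict mapping (u, v) node pairs to a numeric weight (undirected).
    node_group: dict mapping each node to its coarser-level group.
    Returns a dict mapping sorted (group_a, group_b) pairs to summed weights.
    """
    aggregated = defaultdict(int)
    for (u, v), weight in edges.items():
        # Sort the pair so (a, b) and (b, a) fall into the same cell.
        key = tuple(sorted((node_group[u], node_group[v])))
        aggregated[key] += weight
    return dict(aggregated)


# Hypothetical co-authorship network: edge weight = number of joint papers.
edges = {
    ("ann", "bob"): 3,
    ("ann", "cat"): 1,
    ("bob", "dan"): 2,
    ("cat", "dan"): 4,
}
# Hypothetical node attribute used as the coarser level of the hierarchy.
institution = {"ann": "Inst-A", "bob": "Inst-A", "cat": "Inst-B", "dan": "Inst-B"}

coarse = roll_up(edges, institution)
for pair, weight in sorted(coarse.items()):
    print(pair, weight)
```

In this toy view, edges inside a group become a self-loop on the merged node, so the coarse graph still records how much collaboration happens within each institution as well as across institutions.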

Original language: English (US)
Title of host publication: Link Mining
Subtitle of host publication: Models, Algorithms, and Applications
Publisher: Springer New York
Pages: 411-438
Number of pages: 28
Volume: 9781441965158
ISBN (Electronic): 9781441965158
ISBN (Print): 9781441965141
DOIs: https://doi.org/10.1007/978-1-4419-6515-8-16
State: Published - Jan 1 2010

ASJC Scopus subject areas

  • Medicine(all)

Cite this

Chen, C., Zhu, F., Yan, X., Han, J., Yu, P., & Ramakrishnan, R. (2010). InfoNetOLAP: OLAP and mining of information networks. In Link Mining: Models, Algorithms, and Applications (Vol. 9781441965158, pp. 411-438). Springer New York. https://doi.org/10.1007/978-1-4419-6515-8-16

@inbook{475fc5e297e84134b8c3b9deeef6cd86,
title = "InfoNetOLAP: OLAP and mining of information networks",
author = "Chen Chen and Feida Zhu and Xifeng Yan and Jiawei Han and Philip Yu and Raghu Ramakrishnan",
year = "2010",
month = "1",
day = "1",
doi = "10.1007/978-1-4419-6515-8-16",
language = "English (US)",
isbn = "9781441965141",
volume = "9781441965158",
pages = "411--438",
booktitle = "Link Mining",
publisher = "Springer New York",

}

TY - CHAP

T1 - InfoNetOLAP

T2 - OLAP and mining of information networks

AU - Chen, Chen

AU - Zhu, Feida

AU - Yan, Xifeng

AU - Han, Jiawei

AU - Yu, Philip

AU - Ramakrishnan, Raghu

PY - 2010/1/1

Y1 - 2010/1/1

UR - http://www.scopus.com/inward/record.url?scp=84879851576&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84879851576&partnerID=8YFLogxK

U2 - 10.1007/978-1-4419-6515-8-16

DO - 10.1007/978-1-4419-6515-8-16

M3 - Chapter

AN - SCOPUS:84879851576

SN - 9781441965141

VL - 9781441965158

SP - 411

EP - 438

BT - Link Mining

PB - Springer New York

ER -