TY - GEN
T1 - Construction of structured heterogeneous networks from massive text data
AU - Han, Jiawei
N1 - Research was sponsored in part by the U.S. Army Research Lab. under Cooperative Agreement No. W911NF-09-2-0053 (NSCTA), National Science Foundation IIS-1320617 and IIS 16-18481, and grant 1U54GM114838 awarded by NIGMS through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative (www.bd2k.nih.gov). The views and conclusions contained in this document are those of the author(s) and should not be interpreted as representing the official policies of the U.S. Army Research Laboratory or the U.S. Government.
PY - 2017/5/14
Y1 - 2017/5/14
N2 - Network data analytics is important, powerful, and exciting. How big a role can network data analytics play in the real world? Much real-world data is unstructured, in the form of natural language text. A grand challenge in big data research is to develop effective and scalable methods to turn such massive text data into actionable knowledge. In order to turn such massive unstructured, text-rich, but interconnected data into knowledge, we propose a data-to-network-to-knowledge (D2N2K) paradigm, that is, first transform data into relatively structured heterogeneous information networks, and then mine such text-rich and structure-rich heterogeneous networks to generate useful knowledge. We argue that such a paradigm represents a promising direction and that network data analytics will play an essential role in transforming data into knowledge. However, a critical bottleneck in this game is mining structures from text data. We present our recent progress on developing effective methods for mining structures from massive text data and constructing structured heterogeneous information networks.
AB - Network data analytics is important, powerful, and exciting. How big a role can network data analytics play in the real world? Much real-world data is unstructured, in the form of natural language text. A grand challenge in big data research is to develop effective and scalable methods to turn such massive text data into actionable knowledge. In order to turn such massive unstructured, text-rich, but interconnected data into knowledge, we propose a data-to-network-to-knowledge (D2N2K) paradigm, that is, first transform data into relatively structured heterogeneous information networks, and then mine such text-rich and structure-rich heterogeneous networks to generate useful knowledge. We argue that such a paradigm represents a promising direction and that network data analytics will play an essential role in transforming data into knowledge. However, a critical bottleneck in this game is mining structures from text data. We present our recent progress on developing effective methods for mining structures from massive text data and constructing structured heterogeneous information networks.
KW - Network mining
KW - Structure mining from massive text data
UR - https://www.scopus.com/pages/publications/85021447067
UR - https://www.scopus.com/pages/publications/85021447067#tab=citedBy
U2 - 10.1145/3068943.3068944
DO - 10.1145/3068943.3068944
M3 - Conference contribution
AN - SCOPUS:85021447067
T3 - Proceedings of the 2nd ACM SIGMOD Workshop on Network Data Analytics, NDA 2017
BT - Proceedings of the 2nd ACM SIGMOD Workshop on Network Data Analytics, NDA 2017
A2 - Roy, Shourya
A2 - Bhattacharya, Arnab
A2 - Arora, Akhil
PB - Association for Computing Machinery
T2 - 2nd ACM SIGMOD Workshop on Network Data Analytics, NDA 2017
Y2 - 19 May 2017
ER -