TY - GEN
T1 - Automatic Entity Recognition and Typing in Massive Text Corpora
AU - Ren, Xiang
AU - El-Kishky, Ahmed
AU - Wang, Chi
AU - Han, Jiawei
N1 - Publisher Copyright: © 2016 owner/author(s).
PY - 2016/4/11
Y1 - 2016/4/11
N2 - In today's computerized and information-based society, we are soaked with vast amounts of natural language text data, ranging from news articles, product reviews, and advertisements to a wide range of user-generated content from social media. To turn such massive unstructured text data into actionable knowledge, one of the grand challenges is to gain an understanding of entities and the relationships between them. In this tutorial, we introduce data-driven methods to recognize typed entities of interest in different kinds of text corpora (especially in massive, domain-specific text corpora). These methods can automatically identify token spans as entity mentions in text and label their types (e.g., people, product, food) in a scalable way. We demonstrate on real datasets, including news articles and Yelp reviews, how these typed entities aid in knowledge discovery and management.
KW - entity recognition and typing
KW - massive text corpora
UR - http://www.scopus.com/inward/record.url?scp=85047801459&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047801459&partnerID=8YFLogxK
U2 - 10.1145/2872518.2891065
DO - 10.1145/2872518.2891065
M3 - Conference contribution
AN - SCOPUS:85047801459
T3 - WWW 2016 Companion - Proceedings of the 25th International Conference on World Wide Web
SP - 1025
EP - 1028
BT - WWW 2016 Companion - Proceedings of the 25th International Conference on World Wide Web
PB - Association for Computing Machinery
T2 - 25th International Conference on World Wide Web, WWW 2016
Y2 - 11 April 2016 through 15 April 2016
ER -