TY - GEN
T1 - On dataless hierarchical text classification
AU - Song, Yangqiu
AU - Roth, Dan
PY - 2014/1/1
Y1 - 2014/1/1
N2 - In this paper, we systematically study the problem of dataless hierarchical text classification. Unlike standard text classification schemes that rely on supervised training, dataless classification depends on understanding the labels of the sought-after categories and requires no labeled data. Given a collection of text documents and a set of labels, we show that understanding the labels can be used to accurately categorize the documents. This is done by embedding both labels and documents in a semantic space that allows one to compute meaningful semantic similarity between a document and a potential label. We show that this scheme can be used to support accurate multiclass classification without any supervision. We study several semantic representations and show how to improve the classification using bootstrapping. Our results show that bootstrapped dataless classification is competitive with supervised classification with thousands of labeled examples.
AB - In this paper, we systematically study the problem of dataless hierarchical text classification. Unlike standard text classification schemes that rely on supervised training, dataless classification depends on understanding the labels of the sought-after categories and requires no labeled data. Given a collection of text documents and a set of labels, we show that understanding the labels can be used to accurately categorize the documents. This is done by embedding both labels and documents in a semantic space that allows one to compute meaningful semantic similarity between a document and a potential label. We show that this scheme can be used to support accurate multiclass classification without any supervision. We study several semantic representations and show how to improve the classification using bootstrapping. Our results show that bootstrapped dataless classification is competitive with supervised classification with thousands of labeled examples.
UR - http://www.scopus.com/inward/record.url?scp=84908216690&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84908216690&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84908216690
T3 - Proceedings of the National Conference on Artificial Intelligence
SP - 1579
EP - 1585
BT - Proceedings of the National Conference on Artificial Intelligence
PB - AI Access Foundation
T2 - 28th AAAI Conference on Artificial Intelligence, AAAI 2014, 26th Innovative Applications of Artificial Intelligence Conference, IAAI 2014 and the 5th Symposium on Educational Advances in Artificial Intelligence, EAAI 2014
Y2 - 27 July 2014 through 31 July 2014
ER -