TY - GEN
T1 - Context sensitive paraphrasing with a global unsupervised classifier
AU - Connor, Michael
AU - Roth, Dan
PY - 2007
Y1 - 2007
AB - Lexical paraphrasing is an inherently context sensitive problem because a word's meaning depends on context. Most paraphrasing work finds patterns and templates that can replace other patterns or templates in some context, but we are attempting to make decisions for a specific context. In this paper we develop a global classifier that takes a word v and its context, along with a candidate word u, and determines whether u can replace v in the given context while maintaining the original meaning. We develop an unsupervised, bootstrapped, learning approach to this problem. Key to our approach is the use of a very large amount of unlabeled data to derive a reliable supervision signal that is then used to train a supervised learning algorithm. We demonstrate that our approach performs significantly better than state-of-the-art paraphrasing approaches, and generalizes well to unseen pairs of words.
UR - http://www.scopus.com/inward/record.url?scp=38049166101&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38049166101&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-74958-5_13
DO - 10.1007/978-3-540-74958-5_13
M3 - Conference contribution
AN - SCOPUS:38049166101
SN - 9783540749578
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 104
EP - 115
BT - Machine Learning
PB - Springer
T2 - 18th European Conference on Machine Learning, ECML 2007
Y2 - 17 September 2007 through 21 September 2007
ER -