TY - GEN
T1 - Adapting text instead of the model
T2 - 15th Conference on Computational Natural Language Learning, CoNLL 2011
AU - Kundu, Gourab
AU - Roth, Dan
PY - 2011
Y1 - 2011
N2 - Natural language systems trained on labeled data from one domain do not perform well on other domains. Most adaptation algorithms proposed in the literature train a new model for the new domain using unlabeled data. However, it is time-consuming to retrain large models or pipeline systems. Moreover, the domain of a new target sentence may not be known, and one may not have a significant amount of unlabeled data for every new domain. To pursue the goal of Open Domain NLP (train once, test anywhere), we propose ADUT (ADaptation Using label-preserving Transformation), an approach that avoids the need for retraining and does not require knowledge of the new domain, or any data from it. Our approach applies simple label-preserving transformations to the target text so that the transformed text is more similar to the training domain; it then applies the existing model to the transformed sentences and combines the predictions to produce the desired prediction on the target text. We instantiate ADUT for the case of Semantic Role Labeling (SRL) and show that it compares favorably with approaches that retrain their model on the target domain. Specifically, this "on the fly" adaptation approach yields a 13% error reduction for a single-parse system when adapting from newswire text to fiction.
UR - http://www.scopus.com/inward/record.url?scp=84862296344&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862296344&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84862296344
SN - 9781932432923
T3 - CoNLL 2011 - Fifteenth Conference on Computational Natural Language Learning, Proceedings of the Conference
SP - 229
EP - 237
BT - CoNLL 2011 - Fifteenth Conference on Computational Natural Language Learning, Proceedings of the Conference
Y2 - 23 June 2011 through 24 June 2011
ER -