TY - JOUR
T1 - Learning in natural language
AU - Roth, Dan
PY - 1999
Y1 - 1999
AB - Statistics-based classifiers in natural language are typically developed by assuming a generative model for the data, estimating its parameters from training data, and then using Bayes' rule to obtain a classifier. For many problems the assumptions made by the generative models are evidently wrong, leaving open the question of why these approaches work. This paper presents a learning-theory account of the major statistical approaches to learning in natural language. A class of Linear Statistical Queries (LSQ) hypotheses is defined, and learning with it is shown to exhibit some robustness properties. Many statistical learners used in natural language, including naive Bayes, Markov models, and maximum entropy models, are shown to be LSQ hypotheses, explaining the robustness of these predictors even when the underlying probabilistic assumptions do not hold. This coherent view of when and why learning approaches work in this context may help to develop better learning methods and an understanding of the role of learning in natural language inferences.
UR - http://www.scopus.com/inward/record.url?scp=84880673754&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84880673754&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84880673754
SN - 1045-0823
VL - 2
SP - 898
EP - 904
JO - IJCAI International Joint Conference on Artificial Intelligence
JF - IJCAI International Joint Conference on Artificial Intelligence
T2 - 16th International Joint Conference on Artificial Intelligence, IJCAI 1999
Y2 - 31 July 1999 through 6 August 1999
ER -