TY - GEN
T1 - Adding smarter systems instead of human annotators
T2 - 1st International Workshop on Search and Mining Entity-Relationship Data, SMER'11, Held at 20th ACM Conference on Information and Knowledge Management, CIKM 2011
AU - Tamang, Suzanne
AU - Ji, Heng
PY - 2011
Y1 - 2011
N2 - Using a Knowledge Base Population (KBP) slot filling task as a case study, we describe a re-ranking framework in the context of two experimental settings: (1) high transparency, where a few pipelines share similar resources that can be used to provide the developer with detailed intermediate answer results; (2) low transparency, where many systems use diverse resources and serve as black boxes, without any intermediate system results. In both settings, our results show that statistical re-ranking can effectively combine automated systems, achieving better performance than the best state-of-the-art individual system (6.6% absolute improvement in F-score) and alternative combination methods. Furthermore, to create labeled data for system development and assessment, information extraction tasks often require expensive human annotators to struggle with the vast amounts of information contained within a large-scale corpus. In this paper, we demonstrate the impact of our learning-to-rank framework, which combines output from multiple slot filling systems to populate entity-attribute facts in a knowledge base. We show that our approach can be used to create answer keys more efficiently and at a lower cost (63.5% reduction) than laborious human annotation.
AB - Using a Knowledge Base Population (KBP) slot filling task as a case study, we describe a re-ranking framework in the context of two experimental settings: (1) high transparency, where a few pipelines share similar resources that can be used to provide the developer with detailed intermediate answer results; (2) low transparency, where many systems use diverse resources and serve as black boxes, without any intermediate system results. In both settings, our results show that statistical re-ranking can effectively combine automated systems, achieving better performance than the best state-of-the-art individual system (6.6% absolute improvement in F-score) and alternative combination methods. Furthermore, to create labeled data for system development and assessment, information extraction tasks often require expensive human annotators to struggle with the vast amounts of information contained within a large-scale corpus. In this paper, we demonstrate the impact of our learning-to-rank framework, which combines output from multiple slot filling systems to populate entity-attribute facts in a knowledge base. We show that our approach can be used to create answer keys more efficiently and at a lower cost (63.5% reduction) than laborious human annotation.
KW - information extraction
KW - knowledge base population
KW - supervised re-ranking
KW - text analysis
UR - https://www.scopus.com/pages/publications/83255189771
UR - https://www.scopus.com/pages/publications/83255189771#tab=citedBy
U2 - 10.1145/2064988.2064992
DO - 10.1145/2064988.2064992
M3 - Conference contribution
AN - SCOPUS:83255189771
SN - 9781450309578
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 3
EP - 8
BT - CIKM 2011 Glasgow
Y2 - 28 October 2011 through 28 October 2011
ER -