TY - GEN
T1 - HIT
T2 - 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020
AU - Wang, Yu
AU - Li, Yun
AU - Tong, Hanghang
AU - Zhu, Ziye
N1 - This work was partially supported by the Natural Science Foundation of China (No. 61772284) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (SJKY19 0763). Hanghang Tong is partially supported by NSF (1947135, 1715385, and 1939725).
PY - 2020
Y1 - 2020
AB - Named Entity Recognition (NER) is a fundamental task in natural language processing. In order to identify entities with nested structure, many sophisticated methods have been recently developed based on either the traditional sequence labeling approaches or directed hypergraph structures. Despite being successful, these methods often fall short in striking a good balance between the expression power for nested structure and the model complexity. To address this issue, we present a novel nested NER model named HIT. Our proposed HIT model leverages two key properties pertaining to the (nested) named entity, including (1) explicit boundary tokens and (2) tight internal connection between tokens within the boundary. Specifically, we design (1) Head-Tail Detector based on the multi-head self-attention mechanism and bi-affine classifier to detect boundary tokens, and (2) Token Interaction Tagger based on traditional sequence labeling approaches to characterize the internal token connection within the boundary. Experiments on three public NER datasets demonstrate that the proposed HIT achieves state-of-the-art performance.
UR - http://www.scopus.com/inward/record.url?scp=85107674865&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85107674865&partnerID=8YFLogxK
U2 - 10.18653/v1/2020.emnlp-main.486
DO - 10.18653/v1/2020.emnlp-main.486
M3 - Conference contribution
AN - SCOPUS:85107674865
T3 - EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
SP - 6027
EP - 6036
BT - EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
Y2 - 16 November 2020 through 20 November 2020
ER -