TY - GEN
T1 - ReactIE
T2 - 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
AU - Zhong, Ming
AU - Ouyang, Siru
AU - Jiang, Minhao
AU - Hu, Vivian
AU - Jiao, Yizhu
AU - Wang, Xuan
AU - Han, Jiawei
N1 - Publisher Copyright: © 2023 Association for Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - Structured chemical reaction information plays a vital role for chemists engaged in laboratory work and advanced endeavors such as computer-aided drug design. Despite the importance of extracting structured reactions from scientific literature, data annotation for this purpose is cost-prohibitive due to the significant labor required from domain experts. Consequently, the scarcity of sufficient training data poses an obstacle to the progress of related models in this domain. In this paper, we propose REACTIE, which combines two weakly supervised approaches for pre-training. Our method utilizes frequent patterns within the text as linguistic cues to identify specific characteristics of chemical reactions. Additionally, we adopt synthetic data from patent records as distant supervision to incorporate domain knowledge into the model. Experiments demonstrate that REACTIE achieves substantial improvements and outperforms all existing baselines.
UR - http://www.scopus.com/inward/record.url?scp=85174880638&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174880638&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85174880638
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 12120
EP - 12130
BT - Findings of the Association for Computational Linguistics, ACL 2023
PB - Association for Computational Linguistics (ACL)
Y2 - 9 July 2023 through 14 July 2023
ER -