Fusion: Towards Automated ICD Coding via Feature Compression

Junyu Luo, Cao Xiao, Lucas Glass, Jimeng Sun, Fenglong Ma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


ICD coding aims to automatically assign International Classification of Diseases (ICD) codes from unstructured clinical notes or discharge summaries, which saves human labor and reduces errors. Although several approaches have been proposed for this challenging task, none distinguishes the importance of different phrases within a word window. Intuitively, informative phrases should be more useful for prediction. This paper proposes a feature-compressed ICD coding model named Fusion to address this issue. In particular, we propose an attentive soft-pooling approach that compresses sparse and redundant word representations into informative, dense ones that serve as local features. In addition, we use a key-query attention mechanism to model the inner relations among local features and generate global features, which are then used to predict ICD codes. Experiments on two widely used datasets demonstrate that Fusion outperforms baselines. However, on the MIMIC-III Full dataset, we find that none of the state-of-the-art approaches performs significantly better than the others. Thus, automated ICD coding remains a challenging task.
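
The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed `score_vector`, the non-overlapping windows, and the use of identity projections in the attention step are all assumptions made for illustration, and the actual model learns its parameters and includes components not described in this record.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    """Combine vectors using the given weights, dimension by dimension."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def attentive_soft_pool(word_vectors, score_vector, window):
    """Compress each non-overlapping window of word vectors into one local feature.

    Assumption: scores come from a dot product with a fixed score_vector;
    in a trained model this scoring would be learned.
    """
    local = []
    for start in range(0, len(word_vectors), window):
        chunk = word_vectors[start:start + window]
        weights = softmax([dot(score_vector, v) for v in chunk])
        local.append(weighted_sum(weights, chunk))
    return local

def key_query_attention(features):
    """Scaled dot-product self-attention over local features.

    Assumption: queries, keys, and values are the features themselves
    (identity projections), purely to keep the sketch short.
    """
    scale = math.sqrt(len(features[0]))
    out = []
    for q in features:
        weights = softmax([dot(q, k) / scale for k in features])
        out.append(weighted_sum(weights, features))
    return out

# Toy input: five 2-dimensional "word" vectors (hypothetical values).
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
local_features = attentive_soft_pool(words, score_vector=[1.0, -1.0], window=2)
global_features = key_query_attention(local_features)
```

With a window of 2, the five toy word vectors are compressed into three local features, and the attention step then mixes those three into three global features of the same dimension. Within the first window, the word that scores higher under `score_vector` dominates the pooled feature, which is the sense in which pooling is "soft" rather than a hard max or mean.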

Original language: English (US)
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL-IJCNLP 2021
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 6
ISBN (Electronic): 9781954085541
State: Published - 2021
Event: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 - Virtual, Online
Duration: Aug 1 2021 → Aug 6 2021

Publication series

Name: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Conference: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
City: Virtual, Online

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

