TY - JOUR
T1 - Multi-channel, convolutional attention based neural model for automated diagnostic coding of unstructured patient discharge summaries
AU - Mayya, Veena
AU - Sowmya Kamath, S.
AU - Krishnan, Gokul S.
AU - Gangavarapu, Tushaar
N1 - Funding Information:
This research was funded by the DST-SERB, India Early Career Research Grant ECR/2017/001056, provided by the Government of India. Any findings, opinions, and recommendations or conclusions expressed in this study are those of the authors and do not reflect the views of the funding agency.
Publisher Copyright:
© 2021 Elsevier B.V.
PY - 2021/5
Y1 - 2021/5
AB - Effective coding of patient records in hospitals is an essential requirement for epidemiology, billing, and managing insurance claims. The prevalent practice of manual coding, carried out by trained medical coders, is error-prone and time-consuming. Mitigating this labor-intensive process by developing diagnostic coding systems built on patients’ Electronic Medical Records (EMRs) is vital. However, developing nations with low digitization rates have limited availability of structured EMRs, necessitating systems that leverage unstructured data sources. Despite the rich clinical information available in such unstructured data, modeling it is complex, owing to the variety and sparseness of diagnostic codes, the complex structural and temporal nature of summaries, and the prolific use of medical jargon. This work proposes a context-attentive network for automatic diagnostic code assignment, framed as a multi-label classification problem. The proposed model aggregates information across a patient's discharge summary via multi-channel, variable-sized convolutional filters that extract multi-granular snippets. The attention mechanism selects the vital segments in those snippets that map to the clinical codes. The model outperformed state-of-the-art approaches on the MIMIC-III database. Additionally, experimental validation using the CodiEsp dataset demonstrated the model's interpretability and explainability.
UR - http://www.scopus.com/inward/record.url?scp=85100018423&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85100018423&partnerID=8YFLogxK
U2 - 10.1016/j.future.2021.01.013
DO - 10.1016/j.future.2021.01.013
M3 - Article
AN - SCOPUS:85100018423
SN - 0167-739X
VL - 118
SP - 374
EP - 391
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
ER -