FarSight: Long-Term Disease Prediction Using Unstructured Clinical Nursing Notes

Tushaar Gangavarapu, Gokul S Krishnan, Sowmya Kamath S, Jayakumar Jeganathan

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)


Accurate risk stratification using patient data is a vital task in channeling prioritized care. Most state-of-the-art models rely predominantly on digitized data in the form of structured Electronic Health Records (EHRs). Those models overlook the valuable patient-specific information embedded in unstructured clinical notes, which are the prevalent medium employed by caregivers to record patients' disease timelines. The availability of such patient-specific data presents an unprecedented opportunity to build intelligent systems that provide exclusive insights into patients' disease physiology. Moreover, very few works have attempted to benchmark the performance of deep neural architectures against the state-of-the-art models on publicly available datasets. This paper presents significant observations from our benchmarking experiments on the applicability of deep learning models for the clinical task of ICD-9 code group prediction. We present FarSight, a long-term aggregation mechanism intended to recognize the onset of the disease with the earliest detected symptoms. Vector space and topic modeling approaches are utilized to capture the semantic information in the patient representations. Experiments on the MIMIC-III database underscore the superior performance of the proposed models built on unstructured data when compared with the structured EHR-based state-of-the-art model, achieving an improvement of 19.34% in AUPRC and 5.41% in AUROC.
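As a rough illustration of two ideas named in the abstract, the sketch below merges a patient's nursing notes into one chronologically ordered document and computes the two reported metrics, AUROC and AUPRC. It is not the authors' FarSight implementation: the `(patient_id, timestamp, text)` tuple layout, the function names, and the use of average precision as the AUPRC estimate are all assumptions made for this example.

```python
from collections import defaultdict


def aggregate_notes(notes):
    """Merge each patient's notes into one document, ordered by timestamp.

    `notes` is an iterable of (patient_id, timestamp, text) tuples.
    This layout is hypothetical and not taken from the paper.
    """
    per_patient = defaultdict(list)
    for patient_id, timestamp, text in notes:
        per_patient[patient_id].append((timestamp, text))
    return {
        pid: " ".join(text for _, text in sorted(items))
        for pid, items in per_patient.items()
    }


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    Assumes at least one positive (1) and one negative (0) label.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, a common estimate of AUPRC.

    Assumes at least one positive label; tied scores keep input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits


# Toy data, invented for illustration only.
notes = [
    ("p1", 2, "fever persists"),
    ("p2", 1, "stable"),
    ("p1", 1, "mild cough"),
]
documents = aggregate_notes(notes)

labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(documents)
print(auroc(labels, scores), average_precision(labels, scores))
```

AUPRC is usually the more informative of the two metrics when positive cases are rare, which is typical for disease-group labels.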

Original language: English
Journal: IEEE Transactions on Emerging Topics in Computing
Publication status: Published - 01-01-2021

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Information Systems
  • Human-Computer Interaction
  • Computer Science Applications

