TY - GEN
T1 - Infant Cry Classification using Transfer Learning
AU - Anjali, Golla
AU - Sanjeev, Santosh
AU - Mounika, Akuraju
AU - Suhas, Gangireddy
AU - Reddy, G. Pradeep
AU - Kshiraja, Yarlagadda
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Because infants cannot speak, crying is their main means of communication. Failing to understand these cries can affect an infant's health. Parents and caretakers generally find it very difficult to interpret these cries and understand what the infant is feeling. Many approaches have been explored to tackle the problem of infant cry classification using different datasets. The intrinsic obstacle in infant cry classification is that the available datasets are small and that audio captured in real time is prone to noise, since samples are collected in different environments. Also, to the best of our knowledge, no real-time system has been deployed for infant cry classification. With this in view, this paper aims to address these issues and to design a real-time embedded system that runs a deep learning model based on transfer learning to classify infant cries. The Dunstan Baby Language dataset is used for this research. Various features for cry classification were considered, along with candidate approaches such as CNN, CNN+LSTM, a Hybrid Mixed Deep Learning model, and models based on Transfer Learning. The best performance was attained by the fine-tuned VGG16 model, with an accuracy of 0.92 and an F1 score of 0.92.
UR - https://www.scopus.com/pages/publications/85145658657
UR - https://www.scopus.com/pages/publications/85145658657#tab=citedBy
U2 - 10.1109/TENCON55691.2022.9977793
DO - 10.1109/TENCON55691.2022.9977793
M3 - Conference contribution
AN - SCOPUS:85145658657
T3 - IEEE Region 10 Annual International Conference, Proceedings/TENCON
BT - Proceedings of 2022 IEEE Region 10 International Conference, TENCON 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE Region 10 International Conference, TENCON 2022
Y2 - 1 November 2022 through 4 November 2022
ER -